Chapter 1
Introduction 


1.1  Background 

Multichannel signal detection is encountered in a wide variety of applications. In radar systems,
sensor arrays are often used to facilitate the so-called space-time adaptive processing (STAP),
which offers enhanced target discrimination capability compared with space-only or time-only
processing [1,2]. In remote sensing systems, multispectral and hyperspectral sensors are used to
collect spectral information across multiple spectral bands, which can be exploited for classification
of different materials or detection of man-made objects on the ground [3,4]. Other examples
of applications include wireless communications, geolocation, sonar, audio and speech processing,
and seismology [5-7].

STAP-based multichannel signal detectors have been successfully utilized to mitigate the effects
of clutter and/or interference in radar, remote sensing, and communication systems [1-5].
The first work on STAP for radar was by Brennan and Reed [8], where optimum space-time filtering
was presented. In [9], the Reed, Mallett, and Brennan (RMB) detector was proposed, which
substitutes the sample covariance matrix for the true covariance matrix in the optimal solution.
The RMB detector is implemented as follows: 1) estimate a covariance matrix from target-free
training data, 2) determine a weight vector from the obtained covariance matrix estimate, and 3) form
a test statistic and compare it with a threshold for signal detection. The RMB detector is not
a constant false alarm rate (CFAR) detector, since the probability of false alarm depends on the
true covariance matrix. A simple modification to the RMB detector leads to the CFAR adaptive
matched filter (AMF) detector [10-12]. Other well-known STAP detectors include Kelly's
generalized likelihood ratio test (GLRT) [13] and the adaptive coherence estimator (ACE)
detector [14-16].

However, the aforementioned conventional STAP detectors involve estimating and inverting
a large space-time covariance matrix of the disturbance signal (viz., clutter, jamming, and
noise) for each test cell using target-free training data. This may impose excessive training and
computational burdens when the joint space-time dimension is large. At a minimum, we need
K > JN training data (see footnote 1) to ensure a full-rank estimate of the JN x JN space-time
covariance matrix, where J denotes the number of spatial channels and N the number of temporal
observations. Moreover, the RMB rule [9] suggests that, on average, K > (2JN - 3) training data are
needed to obtain performance within 3 dB of the optimum bound. Such conditions may not be
satisfied, especially in heterogeneous environments (due to varying terrain, high platform altitude,
bistatic geometry, or conformal arrays, among others) or dense-target environments that offer
limited training, which renders covariance matrix based techniques inapplicable.

The  problem  of  target  detection  with  limited  training  data  support  has  gained  great  attention 
in  recent  decades  [17-21].  Typical  cases  in  airborne  radar  systems  include  dense-target  and/or 
heterogeneous  environments.  In  either  case,  the  detection  performance  of  the  conventional  STAP 
detectors  degrades  significantly  because  of  the  mismatch  of  the  space-time  covariance  matrix 
estimate  relative  to  that  of  the  target  test  cell. 

Parametric model based STAP detectors have also gained significant interest in recent years
[22-25]. In many applications, the disturbance exhibits certain spatial, temporal,
and/or spectral structures that can be exploited to reduce the number of unknowns and ease
the  training/computational  burden.  Among  other  alternatives,  one  general  structured  approach  is 
to  model  the  disturbance  as  a  multichannel  autoregressive  (AR)  process,  which  has  been  found 
to  be  very  useful  in  representing  the  spatial  and  temporal  correlation  of  radar  signals  [22-25].  A 
parametric  detector  based  on  such  a  multichannel  AR  model  was  developed  in  [22,23],  which 
is  referred  to  as  the  parametric  adaptive  matched  filter  (PAMF)  detector.  The  PAMF  detector 
has been shown to significantly outperform the aforementioned covariance matrix-based detectors
for small training size at reduced complexity. Specifically, the PAMF detector models the
disturbance  as  a  multichannel  AR  process  driven  by  a  temporally  white  but  spatially  colored 
multichannel  noise.  While  traditional  STAP  detectors  perform  joint  space-time  whitening  (using 
an  estimate  of  the  space-time  covariance  matrix),  the  PAMF  detector  adopts  a  two-step  approach 
that  involves  temporal  whitening  via  an  inverse  moving-average  (MA)  filter  followed  by  spatial 
whitening.  The  parameters  that  need  to  be  estimated  are  the  AR  coefficient  matrices  and  the 
spatial  covariance  matrix  of  the  driving  multichannel  noise,  which  are  significantly  fewer  than 
what  are  involved  in  estimating  the  space-time  covariance  matrix.  This  is  the  essence  behind  the 
training  and  computational  efficiency  of  the  PAMF  detector. 
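
The saving in unknowns can be illustrated with a rough count. The sketch below is only an illustration: the counting convention (a Hermitian M x M matrix contributes M^2 real unknowns, a general complex J x J matrix contributes 2J^2) and the sizes J = 4, N = 32, P = 2 are assumptions chosen for this example, not values taken from this report.

```python
def covariance_unknowns(J: int, N: int) -> int:
    """Real unknowns in a Hermitian JN x JN space-time covariance matrix."""
    M = J * N
    # M real diagonal entries plus M(M-1)/2 complex off-diagonal entries
    return M * M


def ar_unknowns(J: int, P: int) -> int:
    """Real unknowns in P complex J x J AR coefficient matrices
    plus one Hermitian J x J spatial covariance matrix."""
    return 2 * P * J * J + J * J


J, N, P = 4, 32, 2  # hypothetical sizes, for illustration only
print("space-time covariance:", covariance_unknowns(J, N))
print("multichannel AR model:", ar_unknowns(J, P))
```

Under these assumed sizes the AR parameterization has a few dozen unknowns while the unstructured covariance matrix has thousands, which is the intuition behind the reduced training requirement.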

Although intuitively sound, the PAMF detector was obtained heuristically by modifying
the AMF test statistic. Specifically, it replaces the joint space-time whitening performed
by the AMF detector with two separate whitening procedures in time and space, as discussed
above. The test threshold and the false alarm and detection probabilities were determined primarily
by computer simulation, owing to the limited analysis of the PAMF detector. Moreover, the
aforementioned detectors use only target-free training data to estimate the space-time
covariance matrix, after which a test statistic is formed and compared with a test threshold.
This renders the detectors inapplicable when target-free training data are unavailable.

1 K > JN - 1 training data are needed if both the test and training data are used to estimate the space-time
covariance  matrix. 
1.2 Summary of Work


This  report  examines  the  problem  of  detecting  a  multichannel  signal  in  the  presence  of  spatially 
and  temporally  colored  disturbance.  The  main  research  objective  is  to  develop  multichannel 
parametric  detectors  by  modeling  the  disturbance  as  a  multichannel  AR  process  and  exploiting 
this  parametric  model  in  a  GLRT  approach.  The  joint  probability  density  functions  (PDFs)  of  the 
test  and  training  data  under  the  null  and  alternative  hypotheses  are  used  to  develop  the  parametric 
detectors, as Kelly did in his seminal paper [13] (see footnote 2).

A first contribution of this report is the development of a parametric Rao test for the
multichannel signal detection problem. A generic Rao test is known to offer a standard solution to a
class of parameter testing problems. It is easier to derive and implement than the GLRT, and is
also asymptotically (large-sample in the number of temporal observations and/or training data)
equivalent to the latter. The Rao test was recently used to develop detectors for several other
problems [26,27]. A detailed discussion on the attributes of a generic Rao test can be found
in [7].

Our  parametric  Rao  test  differs  from  the  generic  one  for  multichannel  signal  detection  in 
that  we  make  explicit  use  of  a  multichannel  AR  model  for  the  disturbance  signal.  We  show 
that,  interestingly,  the  parametric  Rao  test  takes  a  form  identical  to  that  of  the  PAMF  detector. 
The  only  difference  is  that  we  use  a  maximum  likelihood  (ML)  based  estimator  that  involves 
using  both  test  and  training  signals  for  parameter  estimation,  whereas  the  estimators  in  [23]  use 
only  training  signals  for  that  purpose.  If  the  ML  estimator  is  utilized,  the  parametric  Rao/PAMF 
detector  is  asymptotically  a  parametric  GLRT.  The  asymptotic  distribution  of  the  test  statistic 
under  both  hypotheses  is  obtained  in  closed  form,  which  can  be  used  to  set  the  test  threshold 
and  compute  the  corresponding  detection  and  false  alarm  probabilities.  Since  the  asymptotic 
distribution  under  the  null  hypothesis  is  independent  of  the  unknown  parameters,  the  parametric 
Rao/PAMF  detector  asymptotically  achieves  constant  false  alarm  rate  (CFAR).  Numerical  results 
are  presented,  which  show  that  our  asymptotic  results  are  accurate  in  predicting  the  performance 
of  the  Rao/PAMF  detector  even  when  the  data  size  is  modest. 

A  second  contribution  of  this  report  is  the  development  of  a  parametric  GLRT.  It  is  natural 
to  extend  the  results  of  the  parametric  Rao  test  and  consider  the  parametric  GLRT  for  several 
reasons.  First,  the  problem  of  interest  is  a  two-sided  parameter  testing  problem  that  admits  no 
uniformly  most  powerful  (UMP)  solution  [7].  A  GLRT  approach  is  widely  used  in  such  cases 
due  to  its  good  asymptotic  properties  including  asymptotic  CFAR  and  consistency.  Second,  the 
parametric GLRT may yield better performance than the parametric Rao detector, especially
when the data are limited. This is because the latter is an asymptotic (large-sample) parametric
GLRT  [7,  Appendix  6B].  Third,  all  Rao  tests,  including  the  parametric  Rao  detector,  are  obtained 
based  on  a  further  approximation  that  is  valid  only  for  weak  signals  [7,  p.  238].  As  such,  the 
parametric  Rao  detector  is  expected  to  degrade  when  the  weak  signal  assumption  is  violated. 
The  above  observations  motivate  us  to  consider  the  parametric  GLRT,  in  hope  of  finding  a  better 
solution  to  the  problem. 

By developing the parametric Rao and GLRT detectors, we also investigate the underlying
ML parameter estimation. The parametric GLRT relies on ML parameter estimation for both
the null and alternative hypotheses, while the parametric Rao test depends only on ML parameter
estimation under the null hypothesis. We show that the ML estimator under the alternative
hypothesis is non-linear and requires a search over a two-dimensional parameter space. To
address this issue, we introduce an asymptotic ML (AML) estimator that is considerably simpler,
yields estimates in a non-iterative fashion, and asymptotically coincides with the optimum ML
estimator. The Cramér-Rao bound (CRB) for the estimation problem is also derived, offering a
baseline for comparing various (unbiased) estimators.

2 It is noted that Kelly's GLRT does not utilize the parametric model for the disturbance.

A third contribution of this report is a simplified parametric GLRT. The parametric GLRT
detector involves a highly nonlinear maximum likelihood estimation procedure, which was solved
via a two-dimensional iterative search method initialized by a suboptimal estimator [28]. To
facilitate detection, a simplified GLRT along with a new estimator is proposed. Both the estimator
and the GLRT are derived in closed form at considerably lower complexity. With adequate training
data, the new GLRT achieves detection performance similar to that of the original one. However,
for  the  more  interesting  case  of  limited  training,  the  original  GLRT  may  become  inferior  due  to 
poor  initialization.  Because  of  its  simpler  form,  the  new  GLRT  also  offers  additional  insight  into 
the  parametric  multichannel  signal  detection  problem.  The  performance  of  the  proposed  detector 
is  assessed  using  both  a  simulated  dataset,  which  was  generated  using  multichannel  AR  models, 
and  the  KASSPER  dataset,  a  widely  used  dataset  with  challenging  heterogeneous  effects  found 
in  real-world  environments. 

A  fourth  contribution  of  this  report  is  the  recursive  implementation  of  parametric  Rao  and 
parametric  GLRT  detectors.  The  parametric  Rao  and  parametric  GLRT  detectors  assume  that 
the  model  order  of  the  multichannel  AR  process  is  known  a  priori  to  the  detector.  In  practice, 
the model order has to be estimated by some model order selection technique. Meanwhile, a
standard non-recursive implementation of the parametric detectors is computationally intensive
since the unknown parameters have to be estimated for all possible model orders before the best
one is identified. To address these issues, we consider the joint model order selection, parameter
estimation, and target detection problem. We present recursive versions of the aforementioned
parametric  detectors  by  integrating  the  multichannel  Levinson  algorithm,  which  is  employed  for 
recursive  and  computationally  efficient  parameter  estimation  [29,30],  with  a  generalized  Akaike 
Information  Criterion  (GAIC)  for  model  order  selection  [30,31].  Numerical  results  show  that  the 
proposed  recursive  parametric  detectors  yield  a  detection  performance  nearly  identical  to  that  of 
their  non-recursive  counterparts  at  significantly  reduced  complexity. 

A fifth contribution of this report is a performance evaluation of parametric space-time adaptive
detectors with more challenging datasets. The parametric detectors were studied under the
assumption that the disturbance follows a multichannel AR process. However, the disturbance
signals in an airborne radar environment do not necessarily follow an exact multichannel AR
model. In Chapter 7, the parametric Rao and GLRT detectors are studied using a number of
datasets: 1) simulated data - the Knowledge-Aided Sensor Signal Processing and Expert Reasoning
(KASSPER) dataset, which was generated by a high-fidelity clutter model; 2) measured data
- the Multi-Channel Airborne Radar Measurement (MCARM) dataset, which was collected from an
L-band airborne phased array radar testbed; and 3) bistatic data that were generated based on a
bistatic geometry. These datasets contain many real-world effects such as heterogeneous terrains,
antenna errors and leakage, dense ground targets/discretes, and range-dependent clutter (bistatic
geometry). Experimental results show that the parametric detectors can provide good detection
performance with limited or no range training even in more realistic radar environments.

The last chapter of this report discusses the application of parametric adaptive signal detection
to hyperspectral imaging (HSI). Traditional approaches to the so-called subpixel target signal
detection problem are training-inefficient due to the need for an estimate of a large covariance
matrix of the background from target-free training pixels. This imposes a training requirement
that is often difficult to meet in a heterogeneous environment. A class of training-efficient adaptive
signal detectors is presented by exploiting a parametric model that takes into account the
non-stationarity of HSI data in the spectral dimension. A maximum likelihood (ML) estimator is
developed to estimate the parameters associated with the proposed parametric model. Several
important issues are discussed, including model order selection, training screening, and time-series
based whitening and detection, which are intrinsic parts of the proposed parametric adaptive
detectors. Experimental results using real HSI data reveal that the proposed parametric detectors
are more training-efficient and outperform conventional covariance-matrix based detectors when
the training size is limited.

The remainder of this report is organized as follows. Chapter 2 briefly reviews STAP
for airborne radar detection. Chapters 3, 4 and 5 present the parametric Rao test, the parametric
GLRT, and the simplified parametric GLRT, respectively. Details of the technical developments
of the results reported in Chapters 3, 4 and 5 are found in Appendices A, B and C, respectively.
Chapter 6 presents the recursive versions of the parametric detectors. Chapter 7 evaluates the
performance of parametric space-time adaptive detection with more challenging datasets.
Chapter 8 discusses the application of parametric adaptive signal detection to hyperspectral
imaging.

Notation: Vectors (matrices) are denoted by boldface lower (upper) case letters; all vectors
are column vectors; superscripts $(\cdot)^T$, $(\cdot)^*$, and $(\cdot)^H$ denote transpose, complex
conjugate, and complex conjugate transpose, respectively; $\mathbf{I}_M$ denotes the $M \times M$
identity matrix (with subscript suppressed sometimes); $\mathcal{CN}(\boldsymbol{\mu}, \mathbf{R})$
denotes the multivariate complex Gaussian distribution with mean vector $\boldsymbol{\mu}$ and
covariance matrix $\mathbf{R}$; $\|\cdot\|$ takes the Frobenius norm of a matrix/vector;
$\otimes$ denotes the Kronecker product; $\mathrm{vec}(\cdot)$ denotes the operation of stacking the
columns of a matrix on top of each other; $\mathbb{C}$ denotes the complex number field;
$\Re\{\cdot\}$ takes the real part of the argument and $\Im\{\cdot\}$ takes the imaginary part; and
finally, $(\cdot)^\dagger$ denotes the Moore-Penrose pseudo-inverse.
Chapter 2

Review  of  Space-time  Adaptive  Processing 


Adaptive  temporal  processing  has  been  used  for  cancellation  of  weather  clutter  [32].  Adaptive 
spatial  processing  or  array  processing  has  been  used  for  interference  nulling  in  radar.  Adaptive 
array  processing  techniques  can  provide  nulling  far  below  the  sidelobe  level  limitation,  but  special 
measures  must  be  taken  to  avoid  the  inclusion  of  mainlobe  clutter  in  an  adaptive  beamforming 
radar [1].

STAP simultaneously combines and processes signals received on multiple channels (the spatial
domain) and from multiple pulse repetition periods (the temporal domain). Therefore, it is
useful  whenever  the  received  signals  (target  and  interference)  are  functions  of  both  space  and 
time  [32].  STAP  is  an  efficient  tool  for  slow  target  detection  in  strong  clutter  environments,  and 
provides  robustness  to  system  errors  and  capability  to  handle  non-stationary  interference  [1]. 

In  this  chapter,  we  focus  on  STAP  for  airborne  radar  detection.  However,  it  should  be  noted 
that  STAP  has  broad  applications  in  radar,  sonar,  remote  sensing,  and  communication  systems 
[1-5]. In addition, the research results obtained in this report may also be applicable to other
areas, e.g., sonar, biomedical imaging, remote sensing, wireless communications, and so on.


2.1  Phased  Array  Radar 

A linear phased array airborne radar with $J$ antennas is considered herein [1]. The radar transmits
a coherent burst of $N$ pulses at a constant pulse repetition frequency (PRF) $f_r = 1/T_r$, where
$T_r$ is the pulse repetition interval (PRI). The transmitter frequency is $f_0 = c/\lambda_0$, where
$c$ is the propagation velocity and $\lambda_0$ is the radar operating wavelength. The waveform
returns are collected over a time interval referred to as the coherent processing interval (CPI).
For each PRI, $K_t$ time (range) samples are collected to cover the range interval. The received
data are organized in a $J \times N \times K_t$ CPI data cube.

The test data, or primary data, consist of $J \times N$ samples from a single range gate. These
samples are arranged in a column vector $\mathbf{x}_0$. The training data, or secondary data, consist
of $K$ range gates, forming a subset of the $K_t - 1$ remaining ones. These are denoted by
$\mathbf{x}_k$, $k = 1, 2, \ldots, K$.
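
The partition of the data cube into test and training vectors can be sketched as follows. This is a minimal illustration, not the report's implementation: the function name `split_cube`, the array layout (channels x pulses x range gates), and the stacking order (element j*N + n holds channel j, pulse n) are assumptions made for the example.

```python
import numpy as np


def split_cube(cube: np.ndarray, test_gate: int, train_gates: list):
    """Split a J x N x Kt data cube into a test vector and training vectors.

    Returns x0 with length J*N and a (J*N) x K matrix whose columns are
    the training vectors taken from the requested range gates.
    """
    J, N, Kt = cube.shape
    # Row-major flattening: index j*N + n holds channel j, pulse n
    # (assumed stacking convention for this sketch).
    x0 = cube[:, :, test_gate].reshape(J * N)
    Xk = np.stack(
        [cube[:, :, k].reshape(J * N) for k in train_gates], axis=1
    )
    return x0, Xk
```

In practice the training gates would be chosen from those surrounding the test gate, excluding the test gate itself.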
Given the test and training data, the detection problem can be expressed as a binary hypothesis
test:

$$
\begin{aligned}
H_0 &: \mathbf{x}_0 = \mathbf{d}_0, \\
H_1 &: \mathbf{x}_0 = \alpha \mathbf{s} + \mathbf{d}_0,
\end{aligned}
\tag{2.1}
$$


where $\mathbf{s}$ is the target steering vector, $\alpha$ is an unknown amplitude, and $\mathbf{d}_0$
is the disturbance data that may be correlated in space and time. In addition to the test data
$\mathbf{x}_0$, there may be a set of training data $\mathbf{x}_k$, $k = 1, \ldots, K$, that are
target-free: $\mathbf{x}_k = \mathbf{d}_k$. The disturbance data $\{\mathbf{d}_k\}_{k=0}^{K}$ are
assumed to be independent and identically distributed (i.i.d.) with distribution
$\mathcal{CN}(\mathbf{0}, \mathbf{R})$, where $\mathbf{R}$ is the unknown space-time covariance matrix.

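For a uniform equi-distant linear array, the steering vector takes the form [23]:

$$
\mathbf{s} = \mathbf{s}_s(\omega_s) \otimes \mathbf{s}_t(\omega_d), \tag{2.2}
$$

where $\mathbf{s}_s(\omega_s)$ denotes the $J \times 1$ spatial steering vector:

$$
\mathbf{s}_s(\omega_s) = \frac{1}{\sqrt{J}} \left[1, e^{j\omega_s}, \ldots, e^{j\omega_s (J-1)}\right]^T, \tag{2.3}
$$

and $\mathbf{s}_t(\omega_d)$ denotes the $N \times 1$ temporal steering vector:

$$
\mathbf{s}_t(\omega_d) = \frac{1}{\sqrt{N}} \left[1, e^{j\omega_d}, \ldots, e^{j\omega_d (N-1)}\right]^T, \tag{2.4}
$$

where $\omega_s$ and $\omega_d$ denote the normalized target spatial and Doppler frequencies, respectively.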
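
The steering vector construction can be sketched in a few lines of NumPy. The function names are hypothetical, and the frequencies are assumed to be in radians per sample, consistent with the exponents in (2.3) and (2.4).

```python
import numpy as np


def spatial_steering(omega_s: float, J: int) -> np.ndarray:
    """J x 1 spatial steering vector with unit norm."""
    return np.exp(1j * omega_s * np.arange(J)) / np.sqrt(J)


def temporal_steering(omega_d: float, N: int) -> np.ndarray:
    """N x 1 temporal steering vector with unit norm."""
    return np.exp(1j * omega_d * np.arange(N)) / np.sqrt(N)


def space_time_steering(omega_s: float, omega_d: float, J: int, N: int) -> np.ndarray:
    """JN x 1 space-time steering vector as a Kronecker product,
    with the spatial factor first as in (2.2)."""
    return np.kron(spatial_steering(omega_s, J), temporal_steering(omega_d, N))
```

Because each factor has unit norm, the resulting space-time steering vector also has unit norm.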


2.2  Non-parametric  STAP  Approaches 

The space-time processor linearly combines all the samples from the range gate of interest and
produces a scalar output [1]:

$$
T = \left|\mathbf{w}^H \mathbf{x}_0\right|^2 \underset{H_0}{\overset{H_1}{\gtrless}} \gamma, \tag{2.5}
$$

where $\gamma$ is the test threshold.

The  space-time  processor  consists  of  three  major  steps.  The  first  step  is  the  training  strategy 
which  selects  the  training  data  from  the  given  data  cube.  The  second  step  is  the  weight  vector 
computation.  Based  on  the  training  data,  the  adaptive  weight  vector  is  computed.  The  final  step 
is  the  weight  vector  application,  which  computes  a  scalar  output,  or  test  statistic.  The  output  of 
the  processor  is  compared  with  a  test  threshold  to  determine  whether  a  target  is  present  or  not. 

2.2.1  Matched  Filter  Detector 

If the space-time covariance matrix, denoted by $\mathbf{R}$, is known exactly, the optimum detector
that maximizes the output signal-to-interference-plus-noise ratio (SINR) is the matched filter (MF) [12]:

$$
T_{\mathrm{MF}} = \left|\mathbf{w}_{\mathrm{MF}}^H \mathbf{x}_0\right|^2
= \frac{\left|\mathbf{s}^H \mathbf{R}^{-1} \mathbf{x}_0\right|^2}{\mathbf{s}^H \mathbf{R}^{-1} \mathbf{s}}
\underset{H_0}{\overset{H_1}{\gtrless}} \gamma_{\mathrm{MF}}, \tag{2.6}
$$

where

$$
\mathbf{w}_{\mathrm{MF}} = \frac{\mathbf{R}^{-1} \mathbf{s}}{\sqrt{\mathbf{s}^H \mathbf{R}^{-1} \mathbf{s}}} \tag{2.7}
$$

and $\gamma_{\mathrm{MF}}$ denotes the MF threshold. The MF detector is obtained by a GLRT approach
(e.g., [7]), in which the ML estimate of the unknown amplitude $\alpha$ is first obtained and then
substituted back into the likelihood ratio to form a test statistic. It should be noted that the MF
detector cannot be implemented in real applications since $\mathbf{R}$ is unknown. However, it
provides a baseline for performance comparison when considering any realizable detection scheme.
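
As an illustration, the MF test statistic can be computed as below. This is a sketch under the assumption that the true covariance matrix is available as a Hermitian positive-definite array; the function name `mf_statistic` is hypothetical.

```python
import numpy as np


def mf_statistic(x0: np.ndarray, s: np.ndarray, R: np.ndarray) -> float:
    """Matched filter statistic |s^H R^{-1} x0|^2 / (s^H R^{-1} s)."""
    # Solve R y = s instead of forming the explicit inverse.
    Rinv_s = np.linalg.solve(R, s)
    # np.vdot conjugates its first argument; since R is Hermitian,
    # (R^{-1} s)^H x0 equals s^H R^{-1} x0.
    num = np.abs(np.vdot(Rinv_s, x0)) ** 2
    den = np.real(np.vdot(s, Rinv_s))
    return float(num / den)
```

The statistic would then be compared with a threshold chosen for the desired false alarm probability.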

2.2.2  Sample  Matrix  Inversion  Approach 

In practice, the unknown $\mathbf{R}$ should be replaced by some estimate, such as the sample
covariance matrix obtained from the secondary data [9]:

$$
\hat{\mathbf{R}} = \frac{1}{K} \sum_{k=1}^{K} \mathbf{x}_k \mathbf{x}_k^H. \tag{2.8}
$$

Using $\hat{\mathbf{R}}$ and the adaptive weight vector obtained by

$$
\mathbf{w}_{\mathrm{RMB}} = \hat{\mathbf{R}}^{-1} \mathbf{s}, \tag{2.9}
$$

gives the RMB detector [9]. It is also known as the sample matrix inversion (SMI) approach. The
RMB detector, however, lacks CFAR.
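
The SMI steps can be sketched as follows, assuming the training vectors are stored as the columns of a matrix `Xk` (a layout chosen for this example; the function names are hypothetical).

```python
import numpy as np


def sample_covariance(Xk: np.ndarray) -> np.ndarray:
    """Sample covariance (1/K) * sum_k x_k x_k^H from a (JN) x K matrix."""
    K = Xk.shape[1]
    return Xk @ Xk.conj().T / K


def rmb_weights(Xk: np.ndarray, s: np.ndarray) -> np.ndarray:
    """Adaptive weight vector R_hat^{-1} s, computed with a linear solve."""
    return np.linalg.solve(sample_covariance(Xk), s)
```

The linear solve requires the sample covariance matrix to be nonsingular, which is where the training-size constraint discussed later in this chapter enters.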


2.2.3  Adaptive  Matched  Filter  Detector 

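A modification to the RMB detector leads to the so-called AMF detector [10-12]:

$$
T_{\mathrm{AMF}} = \frac{\left|\mathbf{s}^H \hat{\mathbf{R}}^{-1} \mathbf{x}_0\right|^2}{\mathbf{s}^H \hat{\mathbf{R}}^{-1} \mathbf{s}}
\underset{H_0}{\overset{H_1}{\gtrless}} \gamma_{\mathrm{AMF}}, \tag{2.10}
$$

where $\gamma_{\mathrm{AMF}}$ denotes the AMF threshold and $\hat{\mathbf{R}}$ is the sample covariance
matrix of (2.8). It is noted that the AMF test statistic exhibits a CFAR property.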


2.2.4  Kelly’s  GLRT 


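Alternatively, one can treat both $\alpha$ and $\mathbf{R}$ as unknowns and estimate them
successively by ML. Such a GLRT approach was pursued by Kelly [13], which gives the following Kelly test:

$$
T_{\mathrm{Kelly}} = \frac{\left|\mathbf{s}^H \hat{\mathbf{R}}^{-1} \mathbf{x}_0\right|^2}
{\left(\mathbf{s}^H \hat{\mathbf{R}}^{-1} \mathbf{s}\right)\left(K + \mathbf{x}_0^H \hat{\mathbf{R}}^{-1} \mathbf{x}_0\right)}
\underset{H_0}{\overset{H_1}{\gtrless}} \gamma_{\mathrm{Kelly}}, \tag{2.11}
$$

where $\gamma_{\mathrm{Kelly}}$ denotes the corresponding threshold.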


2.2.5  Adaptive  Coherence  Estimator  Detector 


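Conte et al. [33,34] and Scharf et al. [14] have proposed another GLRT based detection scheme,
the so-called Adaptive Coherence Estimator (ACE) detector:

$$
T_{\mathrm{ACE}} = \frac{\left|\mathbf{s}^H \hat{\mathbf{R}}^{-1} \mathbf{x}_0\right|^2}
{\left(\mathbf{s}^H \hat{\mathbf{R}}^{-1} \mathbf{s}\right)\left(\mathbf{x}_0^H \hat{\mathbf{R}}^{-1} \mathbf{x}_0\right)}
\underset{H_0}{\overset{H_1}{\gtrless}} \gamma_{\mathrm{ACE}}, \tag{2.12}
$$

where $\gamma_{\mathrm{ACE}}$ denotes the ACE threshold. Clearly, the ACE detector is a normalized
version of the AMF detector. It is noted that the ACE test statistic lies between 0 and 1.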
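
The AMF, Kelly, and ACE statistics share the same numerator and differ only in their normalization, which the following sketch makes explicit. The function name and the dictionary return format are choices made for this example; the covariance estimate is assumed to be Hermitian and nonsingular, scaled by 1/K as in the sample covariance matrix.

```python
import numpy as np


def adaptive_statistics(x0: np.ndarray, s: np.ndarray, R_hat: np.ndarray, K: int) -> dict:
    """Return the AMF, Kelly GLRT, and ACE test statistics."""
    Ri_s = np.linalg.solve(R_hat, s)    # R_hat^{-1} s
    Ri_x = np.linalg.solve(R_hat, x0)   # R_hat^{-1} x0
    num = np.abs(np.vdot(s, Ri_x)) ** 2  # |s^H R_hat^{-1} x0|^2
    sRs = np.real(np.vdot(s, Ri_s))      # s^H R_hat^{-1} s
    xRx = np.real(np.vdot(x0, Ri_x))     # x0^H R_hat^{-1} x0
    return {
        "amf": float(num / sRs),
        "kelly": float(num / (sRs * (K + xRx))),
        "ace": float(num / (sRs * xRx)),
    }
```

By the Cauchy-Schwarz inequality the ACE value is at most 1, consistent with the remark above.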


2.2.6  Drawbacks  of  Non-parametric  STAP  Detectors 

The AMF, ACE, and Kelly's GLRT detectors have a CFAR property, which is a highly desirable
feature of STAP detectors. However, they also entail a large training requirement. In particular,
the sample covariance matrix $\hat{\mathbf{R}}$ has to be inverted, which imposes a constraint on
the training size

$$
K > JN \tag{2.13}
$$

to ensure a full-rank $\hat{\mathbf{R}}$. The RMB rule [9] suggests that at least

$$
K > (2JN - 3) \tag{2.14}
$$

target-free secondary data vectors are needed to obtain an expected performance within 3 dB of
the optimum MF detector. Such a training requirement may be difficult to meet, especially in
non-homogeneous or dense-target environments. Besides excessive training, the computational
complexity of these detectors is also high, since $\hat{\mathbf{R}}$ has to be computed and inverted
for each CPI.
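
The rank constraint can be checked numerically. The sketch below uses hypothetical small sizes (J = 2, N = 4) and white Gaussian training data, both of which are assumptions for illustration only; it shows that the sample covariance matrix is rank deficient when too few training vectors are available.

```python
import numpy as np

rng = np.random.default_rng(0)
J, N = 2, 4          # hypothetical small sizes, so JN = 8
JN = J * N


def sample_cov_rank(K: int) -> int:
    """Rank of the sample covariance built from K random training vectors."""
    X = rng.standard_normal((JN, K)) + 1j * rng.standard_normal((JN, K))
    R_hat = X @ X.conj().T / K
    return int(np.linalg.matrix_rank(R_hat))


rank_short = sample_cov_rank(JN // 2)  # fewer training vectors than dimensions
rank_ample = sample_cov_rank(2 * JN)   # comfortably more training vectors
rmb_count = 2 * JN - 3                 # training count appearing in (2.14)
print(rank_short, rank_ample, rmb_count)
```

With only JN/2 training vectors the estimate cannot be inverted, whereas with 2JN vectors it is full rank.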


2.3  Parametric  STAP  Approaches 

The aforementioned difficulties arise primarily because, for large JN and in the absence of any
specific  structure,  R  involves  an  enormous  number  of  unknowns.  In  many  applications,  however, 
the  disturbance  usually  exhibits  certain  spatial,  temporal,  and/or  spectral  structures  that  can  be 
exploited  to  reduce  the  number  of  unknowns  and  ease  the  training/computational  burden.  Among 
other  alternatives,  one  general  structured  approach  is  to  model  the  disturbance  as  a  multichannel 
AR  process,  which  has  been  found  to  be  very  useful  in  representing  the  spatial  and  temporal 
correlation  of  radar  signals  [22-25]. 

2.3.1  Parametric  Adaptive  Matched  Filter  Detector 

Using both simulated and real data, the PAMF detector [22,23] has been shown to significantly
outperform the aforementioned covariance matrix-based detectors for small training data size at
a reduced complexity. While covariance matrix-based detectors perform joint space-time whitening
for interference mitigation, the PAMF detector adopts a two-step approach, involving temporal
whitening via an inverse moving-average (MA) filter followed by spatial whitening.

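The PAMF detector is given by [23]

$$
T_{\mathrm{PAMF}} = \frac{\left|\sum_{n=P}^{N-1} \hat{\mathbf{s}}_P^H(n) \hat{\mathbf{Q}}_P^{-1} \hat{\mathbf{x}}_{0,P}(n)\right|^2}
{\sum_{n=P}^{N-1} \hat{\mathbf{s}}_P^H(n) \hat{\mathbf{Q}}_P^{-1} \hat{\mathbf{s}}_P(n)}
\underset{H_0}{\overset{H_1}{\gtrless}} \gamma_{\mathrm{PAMF}}, \tag{2.15}
$$

where $\hat{\mathbf{Q}}_P$ denotes an estimate of the spatial covariance matrix $\mathbf{Q}$, and
$\hat{\mathbf{x}}_{0,P}(n)$ and $\hat{\mathbf{s}}_P(n)$ are the temporally whitened test signal and
steering vector, respectively, which are whitened using an inverse AR filter of order $P$ (i.e., a
multichannel MA filter) whose parameters, along with $\hat{\mathbf{Q}}_P$, are estimated from the
secondary data. The parametric approach offers savings in both training and computation, since
the parameters to be estimated are significantly fewer, compared with covariance matrix-based
approaches.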
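
The two-step whitening behind the PAMF statistic can be sketched as below. Several points are assumptions made for this illustration rather than details taken from [23]: the AR sign convention x(n) = -sum_p A^H(p) x(n-p) + e(n), the storage of the data and the steering signal as J x N matrices whose columns are pulses, and the function names. The AR matrices and Q are taken as given here; their estimation from training data is not shown.

```python
import numpy as np


def temporal_whiten(X: np.ndarray, A: list) -> np.ndarray:
    """Apply the inverse AR (multichannel MA) filter of order P = len(A).

    X is J x N; the output is J x (N - P), covering n = P, ..., N-1.
    """
    P, N = len(A), X.shape[1]
    out = X[:, P:].astype(complex)  # copy of x(n) for n = P..N-1
    for p, Ap in enumerate(A, start=1):
        out += Ap.conj().T @ X[:, P - p:N - p]  # add A^H(p) x(n-p)
    return out


def pamf_statistic(X0: np.ndarray, S: np.ndarray, A: list, Q: np.ndarray) -> float:
    """PAMF-style statistic from temporally whitened test and steering signals."""
    Xw = temporal_whiten(X0, A)
    Sw = temporal_whiten(S, A)
    Qi_S = np.linalg.solve(Q, Sw)                    # spatial whitening, Q^{-1} s(n)
    num = np.abs(np.sum(Qi_S.conj() * Xw)) ** 2      # |sum_n s^H(n) Q^{-1} x(n)|^2
    den = np.real(np.sum(Sw.conj() * Qi_S))          # sum_n s^H(n) Q^{-1} s(n)
    return float(num / den)
```

Only P matrices of size J x J and one J x J covariance matrix enter the computation, so no JN x JN matrix is ever formed or inverted.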

However, when the training data are subject to outlier contamination, the performance of
the PAMF detector degrades severely. Moreover, the PAMF detector was obtained in a heuristic
approach by modifying the AMF test statistic. Specifically, it replaces the joint space-time
whitening incurred by the AMF detector with two separate whitening procedures in time and
space as discussed earlier. The test threshold, false alarm, and detection probabilities were
determined primarily by computer simulation, due to the limited theoretical analysis available for the
PAMF detector.


2.3.2  Swindlehurst  and  Stoica’s  GLRT 

Swindlehurst et al. presented a GLRT in [24]. They considered a different detection problem
that involves unknown non-linear signal parameters associated with the signal to be detected.
In their problem, the steering vector is parameterized by unknown Doppler and direction of
arrival (DOA). They also utilized a vector AR model for the disturbance, which is spatially
and temporally correlated. Their vector AR (VAR) approach uses only training data for parameter
estimation. In other words, they treat estimation and detection as separate processes, as opposed
to the joint processing used in Kelly's GLRT.
Chapter 3


Parametric  Rao  Test  for  Multichannel 
Adaptive  Signal  Detection 

3.1  Introduction 

In this chapter, we develop a parametric Rao test for multichannel signal detection. A generic
Rao  test  is  known  to  offer  a  standard  solution  to  a  class  of  parameter  testing  problems.  It  is 
easier  to  derive  and  implement  than  the  GLRT,  and  is  also  asymptotically  (large-sample  in  the 
number  of  temporal  observations  and/or  training  data)  equivalent  to  the  latter.  The  Rao  test  was 
recently  used  to  develop  detectors  for  several  other  problems  [26,27].  A  detailed  discussion  on 
the  attributes  of  a  generic  Rao  test  can  be  found  in  [7]. 

Our parametric Rao test differs from the generic one for multichannel signal detection in
that we make explicit use of a multichannel AR model for the disturbance signal. We show
that, interestingly, the parametric Rao test takes a form identical to that of the PAMF detector.
The only difference is that we use a maximum likelihood (ML) based estimator that uses
both test and training signals for parameter estimation, whereas the estimators in [23] use
only training signals for that purpose. If the ML estimator is utilized, the parametric Rao/PAMF
detector is asymptotically a parametric GLRT. Under the conditions stated in Section 3.2, the
asymptotic distribution of the test statistic under both hypotheses is obtained in closed form,
which can be used to set the test threshold and compute the corresponding detection and false
alarm probabilities. Since the asymptotic distribution under $H_0$ is independent of the unknown
parameters, the parametric Rao/PAMF detector asymptotically achieves constant false alarm rate
(CFAR). Numerical results are presented, which show that our asymptotic results are accurate in
predicting the performance of the Rao/PAMF detector even when the data size is modest.

The remainder of this chapter is organized as follows. Section 3.2 contains the data model and
problem statement. Our main results are summarized in Section 3.3. In particular, Section 3.3.1
contains a summary of the parametric test statistic, while Section 3.3.2 includes our asymptotic
analysis. Details of the technical developments of the results reported in Section 3.3 are found in
Appendices A.1 to A.3. Numerical results are presented in Section 3.4, followed by concluding
remarks in Section 3.5.


3.2  Data  Model  and  Problem  Statement 


The  problem  under  consideration  involves  detecting  a  known  multichannel  signal  with  unknown 
amplitude  in  the  presence  of  spatially  and  temporally  correlated  disturbance  (e.g.,  [1]): 

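$$
\begin{aligned}
H_0 &: \ \mathbf{x}_0(n) = \mathbf{d}(n), & n &= 0, 1, \ldots, N-1,\\
H_1 &: \ \mathbf{x}_0(n) = \alpha\,\mathbf{s}(n) + \mathbf{d}(n), & n &= 0, 1, \ldots, N-1,
\end{aligned} \tag{3.1}
$$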


where all vectors are $J \times 1$ with $J$ denoting the number of spatial channels, and $N$ is the number
of temporal observations. Henceforth, $\mathbf{x}_0(n)$ is called the test signal, $\mathbf{s}(n)$ is the signal to be
detected with amplitude $\alpha$, and $\mathbf{d}(n)$ is the disturbance signal that may be correlated in space
and time. In addition to the test signal, it is assumed that a set of target-free training or secondary
data vectors $\mathbf{x}_k(n)$, $k = 1, 2, \ldots, K$ and $n = 0, 1, \ldots, N-1$, are available to assist in the signal
detection process.

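Define the following $JN \times 1$ space-time vectors:

$$
\begin{aligned}
\mathbf{s} &= [\mathbf{s}^T(0),\ \mathbf{s}^T(1),\ \ldots,\ \mathbf{s}^T(N-1)]^T,\\
\mathbf{d} &= [\mathbf{d}^T(0),\ \mathbf{d}^T(1),\ \ldots,\ \mathbf{d}^T(N-1)]^T,\\
\mathbf{x}_k &= [\mathbf{x}_k^T(0),\ \mathbf{x}_k^T(1),\ \ldots,\ \mathbf{x}_k^T(N-1)]^T, \quad k = 0, 1, \ldots, K.
\end{aligned} \tag{3.2}
$$

Equation (3.1) can be more compactly written as

$$
\begin{aligned}
H_0 &: \ \mathbf{x}_0 = \mathbf{d},\\
H_1 &: \ \mathbf{x}_0 = \alpha\,\mathbf{s} + \mathbf{d}.
\end{aligned} \tag{3.3}
$$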


Clearly, the composite hypothesis testing problem (3.1) or (3.3) is also a two-sided parameter
testing problem that tests $\alpha = 0$ against $\alpha \neq 0$. The general assumptions in the literature are
(e.g., [1,9-13,15,16,22,23]):


• AS1: The signal vector $\mathbf{s}$ is deterministic and known to the detector;

• AS2: The signal amplitude $\alpha$ is complex-valued, deterministic, and unknown;

• AS3: The secondary data $\{\mathbf{x}_k\}_{k=1}^{K}$ and the disturbance signal $\mathbf{d}$ (equivalently, $\mathbf{x}_0$ under $H_0$)
are independent and identically distributed (i.i.d.) with distribution $\mathcal{CN}(\mathbf{0}, \mathbf{R})$, where $\mathbf{R}$ is
the unknown space-time covariance matrix.

In  particular,  the  above  signal  detection  problem  occurs  in  an  airborne  STAP  radar  system 
with  J  array  channels  and  a  coherent  processing  interval  (CPI)  of  N  pulse  repetition  intervals 
(PRIs).  The  disturbance  d(n)  consists  of  ground  clutter,  jamming,  and  thermal  noise,  while 
$\mathbf{s}(n)$ is called the target space-time steering vector. For a uniform equi-spaced linear array,1 the
steering vector is given by [23]:

$$
\mathbf{s}(\omega_s, \omega_d) = \mathbf{s}_s(\omega_s) \otimes \mathbf{s}_t(\omega_d), \tag{3.4}
$$

where $\mathbf{s}_s(\omega_s)$ denotes the $J \times 1$ spatial steering vector:

$$
\mathbf{s}_s(\omega_s) = \frac{1}{\sqrt{J}}\,[1,\ e^{j\omega_s},\ \ldots,\ e^{j\omega_s(J-1)}]^T, \tag{3.5}
$$

and $\mathbf{s}_t(\omega_d)$ denotes the $N \times 1$ temporal steering vector:

$$
\mathbf{s}_t(\omega_d) = \frac{1}{\sqrt{N}}\,[1,\ e^{j\omega_d},\ \ldots,\ e^{j\omega_d(N-1)}]^T, \tag{3.6}
$$

where $\omega_s$ and $\omega_d$ denote the normalized target spatial and Doppler frequencies, respectively.
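As an illustration, the steering vectors in (3.4)-(3.6) can be formed as in the following minimal sketch. It is plain Python, the function names are hypothetical, and it assumes that the normalized frequencies are expressed in radians per sample. The Kronecker ordering follows (3.4) as written, so entry j*N + n of the result corresponds to channel j and pulse n.

```python
import cmath
import math

def spatial_steering(J, omega_s):
    """J x 1 spatial steering vector of (3.5); the 1/sqrt(J) factor gives unit norm."""
    return [cmath.exp(1j * omega_s * j) / math.sqrt(J) for j in range(J)]

def temporal_steering(N, omega_d):
    """N x 1 temporal steering vector of (3.6); the 1/sqrt(N) factor gives unit norm."""
    return [cmath.exp(1j * omega_d * n) / math.sqrt(N) for n in range(N)]

def kron(a, b):
    """Kronecker product of two vectors stored as Python lists."""
    return [x * y for x in a for y in b]

def space_time_steering(J, N, omega_s, omega_d):
    """JN x 1 space-time steering vector of (3.4): spatial (x) temporal."""
    return kron(spatial_steering(J, omega_s), temporal_steering(N, omega_d))

# Example with J = 4 channels and N = 32 pulses (frequencies chosen arbitrarily).
s = space_time_steering(J=4, N=32, omega_s=0.3 * math.pi, omega_d=0.1 * math.pi)
norm_sq = sum(abs(v) ** 2 for v in s)  # product of two unit-norm vectors has unit norm
```

Because both factors are normalized, the space-time steering vector has unit norm, which is convenient when the SINR is defined through the amplitude alone.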

While  Assumptions  AS1  to  AS3  are  standard  (e.g.,  [1,9-13, 15, 16]),  we  further  assume  the 
following: 


•  AS4:  The  disturbance  signal  d(n)  can  be  modeled  as  a  multichannel  AR(P)  process  with 
known  model  order  P  but  unknown  AR  coefficient  matrices  and  spatial  covariance  (see 
Remark  1  below  for  additional  comments  on  this  assumption). 


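Based on Assumption AS4, the secondary data $\{\mathbf{x}_k\}_{k=1}^{K}$ are represented as

$$
\mathbf{x}_k(n) = -\sum_{p=1}^{P} \mathbf{A}^H(p)\,\mathbf{x}_k(n-p) + \boldsymbol{\varepsilon}_k(n), \quad k = 1, 2, \ldots, K, \tag{3.7}
$$

where $\{\mathbf{A}^H(p)\}_{p=1}^{P}$ denote the $J \times J$ AR coefficient matrices, and $\boldsymbol{\varepsilon}_k(n)$ denote the driving
multichannel spatial noise vectors that are temporally white but spatially colored Gaussian noise:
$\boldsymbol{\varepsilon}_k(n) \sim \mathcal{CN}(\mathbf{0}, \mathbf{Q})$, where $\mathbf{Q}$ denotes the $J \times J$ spatial covariance matrix. Meanwhile, the test
signal $\mathbf{x}_0$ is given by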

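$$
\mathbf{x}_0(n) - \alpha\,\mathbf{s}(n) = -\sum_{p=1}^{P} \mathbf{A}^H(p)\,\{\mathbf{x}_0(n-p) - \alpha\,\mathbf{s}(n-p)\} + \boldsymbol{\varepsilon}_0(n), \tag{3.8}
$$

where $\alpha = 0$ under $H_0$, $\alpha \neq 0$ under $H_1$, and $\boldsymbol{\varepsilon}_0(n) \sim \mathcal{CN}(\mathbf{0}, \mathbf{Q})$. Let $\tilde{\mathbf{s}}(n)$ denote a regression
on $\mathbf{s}(n)$ and $\tilde{\mathbf{x}}_0(n)$ a regression on $\mathbf{x}_0(n)$ under $H_1$:

$$
\tilde{\mathbf{s}}(n) = \mathbf{s}(n) + \sum_{p=1}^{P} \mathbf{A}^H(p)\,\mathbf{s}(n-p), \tag{3.9}
$$

$$
\tilde{\mathbf{x}}_0(n) = \mathbf{x}_0(n) + \sum_{p=1}^{P} \mathbf{A}^H(p)\,\mathbf{x}_0(n-p). \tag{3.10}
$$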

1Note that the results presented in this section apply to any array configuration, as long as the steering vector is
known (cf. Assumption AS1).

Then, the driving noise in (3.8) can be alternatively expressed as

$$
\boldsymbol{\varepsilon}_0(n) = \tilde{\mathbf{x}}_0(n) - \alpha\,\tilde{\mathbf{s}}(n). \tag{3.11}
$$
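The regressions (3.9)-(3.10) act as a temporal whitening filter. The following is a minimal NumPy sketch of that operation, not the authors' implementation; the function name `temporal_whiten` and the array layouts are assumptions made for illustration. The short demonstration generates an AR(1) sequence according to (3.7) and checks that whitening recovers the driving noise.

```python
import numpy as np

def temporal_whiten(x, A_H):
    """Apply x_tilde(n) = x(n) + sum_p A^H(p) x(n - p) for n = P, ..., N - 1.

    x   : (N, J) complex array; row n holds the J-channel snapshot x(n)
    A_H : (P, J, J) complex array; A_H[p - 1] holds the matrix A^H(p)
    Returns an (N - P, J) array of whitened snapshots.
    """
    N, _ = x.shape
    P = A_H.shape[0]
    out = x[P:].copy()
    for p in range(1, P + 1):
        # Rows P-p .. N-1-p hold x(n - p) for n = P .. N-1 (stored as row vectors).
        out += x[P - p:N - p] @ A_H[p - 1].T
    return out

# Demonstration: synthesize x(n) = -A^H(1) x(n-1) + eps(n) and whiten it.
rng = np.random.default_rng(0)
J, N, P = 2, 50, 1
A_H = np.array([[[-0.5, 0.1], [0.0, -0.3]]], dtype=complex)  # illustrative values
eps = (rng.standard_normal((N, J)) + 1j * rng.standard_normal((N, J))) / np.sqrt(2)
x = np.zeros((N, J), dtype=complex)
for n in range(N):
    x[n] = eps[n]
    for p in range(1, P + 1):
        if n - p >= 0:
            x[n] -= A_H[p - 1] @ x[n - p]

recovered = temporal_whiten(x, A_H)  # should equal eps[P:] up to rounding
```

With the true coefficient matrices, the whitened output equals the driving noise, which is the content of (3.11) when no signal is present.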

The problem of interest is to develop a decision rule for the above composite hypothesis testing
problem using the test and training signals as well as exploiting the multichannel parametric
AR model.

Remark 1: We shall clarify that our goal here is not to justify whether AR models are appropriate
or not for STAP applications. An answer to the question can be found in [23], where it is
shown  that  low-order  multichannel  AR  models  are  very  powerful  and  efficient  in  capturing  the 
temporal  and  spatial  correlation  of  the  disturbance  and,  hence,  can  greatly  help  signal  detection 
in  airborne  STAP  systems.  As  stated  above,  our  problem  is  how  to  exploit  such  a  parametric 
model  to  solve  the  composite  testing  problem.  The  assumption  that  the  model  order  P  is  known 
is  only  used  to  simplify  our  presentation.  In  practice,  the  model  order  has  to  be  estimated,  and  a 
variety  of  model  order  selection  techniques,  such  as  the  Akaike  Information  Criterion  (AIC)  and 
the  Minimum  Description  Length  (MDL)  based  techniques  (e.g.,  [35]  and  references  therein),  are 
available  for  this  task.  Since  such  techniques  may  over-  or  under-estimate  the  true  model  order, 
a  relevant  problem  is  how  the  proposed  detector  performs  when  over-  or  under-estimation  occurs 
(also  see  [36]).  This  will  be  investigated  in  Section  3.4.  Finally,  it  is  also  possible  to  formulate  the 
problem  to  include  P  as  another  parameter  to  be  estimated.  We  do  not  follow  such  an  approach 
in  order  to  focus  on  the  relations  between  the  parametric  Rao  test  and  the  PAMF  detector,  which 
also  assumes  that  a  prior  estimate  of  P  is  available. 


3.3  The  Parametric  Rao  Test 


3.3.1  Test  Statistic 


The derivation of the parametric Rao test that takes into account Assumptions AS1 to AS4 in
Section 3.2 is presented in Appendix A.2, which in turn relies on the ML estimates of the
nuisance parameters (i.e., parameters associated with the disturbance signal) that are obtained
in Appendix A.1. The test is given by2


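$$
T_{\text{Rao}} = \frac{2\,\Big|\sum_{n=P}^{N-1} \hat{\tilde{\mathbf{s}}}^H(n)\,\hat{\mathbf{Q}}^{-1}\,\hat{\tilde{\mathbf{x}}}_0(n)\Big|^2}{\sum_{n=P}^{N-1} \hat{\tilde{\mathbf{s}}}^H(n)\,\hat{\mathbf{Q}}^{-1}\,\hat{\tilde{\mathbf{s}}}(n)} \ \underset{H_0}{\overset{H_1}{\gtrless}} \ \gamma_{\text{Rao}}, \tag{3.12}
$$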


where $\gamma_{\text{Rao}}$ denotes the test threshold, which can be set by using the results in Section 3.3.2,
$\hat{\tilde{\mathbf{s}}}(n)$ and $\hat{\tilde{\mathbf{x}}}_0(n)$ denote, respectively, the steering vector and test signal that have been whitened
temporally, and additional spatial whitening is provided by $\hat{\mathbf{Q}}^{-1}$, which is the inverse of the ML
estimate of the spatial covariance matrix to be specified next.
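To make the structure of the test statistic concrete, the sketch below evaluates a statistic of the form (3.12) from already-whitened sequences. It is an illustrative NumPy sketch under stated assumptions: the function name `rao_statistic` and the row-per-snapshot layout are hypothetical, and the whitened inputs and covariance estimate are taken as given.

```python
import numpy as np

def rao_statistic(s_w, x_w, Q_hat):
    """Evaluate 2 |sum_n s~^H(n) Q^-1 x~(n)|^2 / sum_n s~^H(n) Q^-1 s~(n).

    s_w, x_w : (N - P, J) complex arrays holding the temporally whitened
               steering vector and test signal for n = P, ..., N - 1
    Q_hat    : (J, J) Hermitian positive-definite spatial covariance estimate
    """
    # Solve linear systems instead of forming the explicit inverse of Q_hat.
    Qinv_x = np.linalg.solve(Q_hat, x_w.T).T  # row n holds Q^-1 x~(n)
    Qinv_s = np.linalg.solve(Q_hat, s_w.T).T  # row n holds Q^-1 s~(n)
    num = np.abs(np.sum(s_w.conj() * Qinv_x)) ** 2
    den = np.real(np.sum(s_w.conj() * Qinv_s))
    return 2.0 * num / den

# Sanity check with Q = I and x~ = alpha * s~: the statistic is 2 |alpha|^2 sum ||s~(n)||^2.
s_demo = np.ones((3, 2), dtype=complex)
alpha_demo = 0.5j
T_demo = rao_statistic(s_demo, alpha_demo * s_demo, np.eye(2))
```

The detector then declares $H_1$ when the returned value exceeds the threshold and $H_0$ otherwise.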

2Although  the  factor  of  2  on  the  test  statistic  can  be  absorbed  by  the  test  threshold,  it  is  retained  to  keep  the 
asymptotic  distribution  of  the  test  statistic  more  compact.  See  Section  3.3.2. 
Specifically, the temporally whitened steering vector and test signal in (3.12) are obtained as
follows:

$$
\hat{\tilde{\mathbf{s}}}(n) = \mathbf{s}(n) + \sum_{p=1}^{P} \hat{\mathbf{A}}^H(p)\,\mathbf{s}(n-p), \tag{3.13}
$$

$$
\hat{\tilde{\mathbf{x}}}_0(n) = \mathbf{x}_0(n) + \sum_{p=1}^{P} \hat{\mathbf{A}}^H(p)\,\mathbf{x}_0(n-p), \tag{3.14}
$$

where $\hat{\mathbf{A}}^H(p)$ denotes the ML estimate of the AR coefficient matrix $\mathbf{A}^H(p)$.

To present the ML estimates more compactly, let

$$
\mathbf{A}^H = [\mathbf{A}^H(1),\ \mathbf{A}^H(2),\ \ldots,\ \mathbf{A}^H(P)] \in \mathbb{C}^{J \times JP}, \tag{3.15}
$$

which contains all the coefficient matrices involved in the $P$-th order AR model, and

$$
\mathbf{y}_k(n) = [\mathbf{x}_k^T(n-1),\ \mathbf{x}_k^T(n-2),\ \ldots,\ \mathbf{x}_k^T(n-P)]^T, \quad k = 0, 1, \ldots, K, \tag{3.16}
$$

which contains the regression subvectors formed from the test signal $\mathbf{x}_0$ or the $k$-th training signal
$\mathbf{x}_k$. We first compute the following correlation matrices:


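$$
\hat{\mathbf{R}}_{xx} = \sum_{n=P}^{N-1} \sum_{k=0}^{K} \mathbf{x}_k(n)\,\mathbf{x}_k^H(n), \tag{3.17}
$$

$$
\hat{\mathbf{R}}_{yy} = \sum_{n=P}^{N-1} \sum_{k=0}^{K} \mathbf{y}_k(n)\,\mathbf{y}_k^H(n), \tag{3.18}
$$

$$
\hat{\mathbf{R}}_{yx} = \sum_{n=P}^{N-1} \sum_{k=0}^{K} \mathbf{y}_k(n)\,\mathbf{x}_k^H(n). \tag{3.19}
$$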

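Then, the ML estimates of the AR coefficients $\mathbf{A}^H$ and the spatial covariance matrix $\mathbf{Q}$ are given
by (see Appendix A.1)

$$
\hat{\mathbf{A}}^H = -\hat{\mathbf{R}}_{yx}^H\,\hat{\mathbf{R}}_{yy}^{-1}, \tag{3.20}
$$

$$
\hat{\mathbf{Q}} = \frac{1}{(K+1)(N-P)} \left( \hat{\mathbf{R}}_{xx} - \hat{\mathbf{R}}_{yx}^H\,\hat{\mathbf{R}}_{yy}^{-1}\,\hat{\mathbf{R}}_{yx} \right). \tag{3.21}
$$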
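The estimates (3.17)-(3.21) amount to a multichannel least-squares fit over the test and training signals. The following NumPy sketch illustrates the computation; the function name `ml_ar_estimates`, the data layout, and the simulated AR(1) example (with illustrative coefficients and identity driving-noise covariance) are assumptions, not part of the original development.

```python
import numpy as np

def ml_ar_estimates(X, P):
    """Return (A_H_hat, Q_hat) following the form of (3.20)-(3.21).

    X : (K + 1, N, J) complex array; X[0] is the test signal, X[1:] the training signals
    A_H_hat has shape (J, J*P) and stacks [A^H(1), ..., A^H(P)] as in (3.15).
    """
    K1, N, J = X.shape
    # Rows of Xn are x_k(n)^T for n = P..N-1; rows of Y are the regressors y_k(n)^T.
    Xn = X[:, P:, :].reshape(-1, J)
    Y = np.concatenate([X[:, P - p:N - p, :] for p in range(1, P + 1)], axis=2)
    Y = Y.reshape(-1, J * P)
    R_xx = Xn.T @ Xn.conj()   # sum of x x^H
    R_yy = Y.T @ Y.conj()     # sum of y y^H
    R_yx = Y.T @ Xn.conj()    # sum of y x^H
    # -R_yx^H R_yy^{-1}, using the fact that R_yy is Hermitian.
    A_H_hat = -np.linalg.solve(R_yy, R_yx).conj().T
    Q_hat = (R_xx + A_H_hat @ R_yx) / (K1 * (N - P))
    return A_H_hat, Q_hat

# Illustrative check on simulated AR(1) data with Q = I (values are arbitrary).
rng = np.random.default_rng(1)
K, N, J, P = 32, 64, 2, 1
A_true = np.array([[-0.5, 0.1], [0.0, -0.3]], dtype=complex)
X = np.zeros((K + 1, N, J), dtype=complex)
for k in range(K + 1):
    eps = (rng.standard_normal((N, J)) + 1j * rng.standard_normal((N, J))) / np.sqrt(2)
    X[k, 0] = eps[0]
    for n in range(1, N):
        X[k, n] = -A_true @ X[k, n - 1] + eps[n]

A_H_hat, Q_hat = ml_ar_estimates(X, P)
```

Setting K = 0, so that `X` contains only the test signal, gives the training-free variant mentioned in Remark 2.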

Remark 2: The PAMF detector also involves estimating the AR coefficients $\mathbf{A}^H$ and the spatial
covariance matrix $\mathbf{Q}$ [23]. Several estimators were suggested, including the Strand-Nuttall
algorithm and the least-squares (LS) estimators. The LS estimator was observed to yield better
performance. Our ML estimator is similar to the LS estimator except that we use both the test
and training signals to obtain parameter estimates, whereas the latter utilizes only the training
signals for parameter estimation. A subscript “P” is therefore used for the parameter estimates in
(2.15) to indicate the difference. Note that with the ML estimator, it is possible to derive parameter
estimates exclusively from the test signal, thus obviating the need for training. This could
be advantageous especially in highly heterogeneous environments where it is difficult to obtain
training signals that are i.i.d. with respect to the disturbance in the test signal. The detection
performance of the parametric Rao detector in the absence of training will be explored elsewhere.
We would like to point out that our approach is similar to Kelly’s GLRT [13], which also employs
both the test and training signals for parameter estimation. However, we stress that Kelly’s
GLRT does not exploit the multichannel parametric model shown in (3.7) and (3.8).

Remark 3: By comparing the parametric Rao test statistic (3.12) and the PAMF test statistic
(2.15), we can quickly see that if both detectors use the ML estimator for parameter estimation,
they are identical except for a scaling factor of 2. Hence, under the conditions stated in Section
3.2, the PAMF detector is a parametric Rao detector. Since the parametric Rao test is asymptotically
equivalent to the parametric GLRT,3 the PAMF detector, with the ML parameter estimates,
is also an asymptotic parametric GLRT. As we shall see in Section 3.3.2, the equivalence offers
additional insights into the performance and implementation of the PAMF detector.

Remark 4: It should be noted that, similar to other STAP detectors, the parametric Rao test is
adaptive in that the detector is data-dependent, as evident in (3.12)-(3.21), in contrast to
data-independent detectors (e.g., a correlator). This should not be confused with recursive adaptive
implementation. Although recursive adaptive implementation of the parametric Rao test would
be of interest in a real-time system, it is beyond the scope of the current section.


3.3.2  Asymptotic  Analysis 


As shown in Appendix A.3, the asymptotic distribution of the Rao/PAMF test statistic is given
by

$$
T_{\text{Rao}} \overset{a}{\sim}
\begin{cases}
\chi_2^2, & \text{under } H_0,\\
\chi_2'^{\,2}(\lambda), & \text{under } H_1,
\end{cases} \tag{3.22}
$$

where $\chi_2^2$ denotes the central Chi-squared distribution with 2 degrees of freedom and $\chi_2'^{\,2}(\lambda)$ the
non-central Chi-squared distribution with 2 degrees of freedom and non-centrality parameter $\lambda$:


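$$
\lambda = 2\,|\alpha|^2 \sum_{n=P}^{N-1} \tilde{\mathbf{s}}^H(n)\,\mathbf{Q}^{-1}\,\tilde{\mathbf{s}}(n), \tag{3.23}
$$

where $\tilde{\mathbf{s}}(n)$ is the temporally whitened steering vector given by (3.9). Note that $\lambda$ is related to the
SINR at the output of the temporal whitening filter. Recall that a $\chi_2^2$ random variable is equivalent
to an exponential random variable with probability density function (PDF) given by

$$
f_{\chi_2^2}(x) = \frac{1}{2} \exp\!\left(-\frac{x}{2}\right), \quad x \ge 0. \tag{3.24}
$$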


3The parametric GLRT is different from Kelly’s GLRT in that the former takes into account the parametric model
in Section 3.2, while the latter does not.
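The PDF of $\chi_2'^{\,2}(\lambda)$ is given by [7]

$$
f_{\chi_2'^{\,2}(\lambda)}(x) = \frac{1}{2} \exp\!\left[-\frac{1}{2}(x + \lambda)\right] I_0\!\left(\sqrt{\lambda x}\right), \quad x \ge 0, \tag{3.25}
$$

where $I_0(u)$ is the modified Bessel function of the first kind and zero-th order defined by

$$
I_0(u) = \frac{1}{\pi} \int_0^{\pi} \exp(u \cos\theta)\, d\theta = \sum_{k=0}^{\infty} \frac{\left(\tfrac{1}{4}u^2\right)^k}{(k!)^2}. \tag{3.26}
$$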


The above distributions can be employed to set the Rao test threshold for a given probability
of false alarm, as well as to compute the detection and false alarm probabilities. For a
given threshold, the probability of false alarm is given by

$$
P_f = \int_{\gamma_{\text{Rao}}}^{\infty} f_{\chi_2^2}(x)\, dx = \exp\!\left(-\frac{\gamma_{\text{Rao}}}{2}\right), \tag{3.27}
$$

which can easily be inverted to find the test threshold $\gamma_{\text{Rao}}$ for a given $P_f$. In addition, the
probability of detection is given by

$$
P_d = \int_{\gamma_{\text{Rao}}}^{\infty} \frac{1}{2} \exp\!\left[-\frac{1}{2}(x + \lambda)\right] I_0\!\left(\sqrt{\lambda x}\right) dx \tag{3.28}
$$

for a given test threshold $\gamma_{\text{Rao}}$.

Remark 5: The asymptotic distribution under $H_0$ is independent of the unknown parameters.
The probability of false alarm in (3.27) depends only on the test threshold, which is a design
parameter. It is evident that the Rao/PAMF test asymptotically achieves CFAR.

Remark 6: The above analysis holds under Assumptions AS1 to AS4 of Section 3.2 with
one exception. In particular, since the ML parameter estimates are asymptotically Gaussian
irrespective of the distribution of the observed data, the above analysis still holds if the Gaussian
assumption in AS3 is dropped. This also explains why it has been observed in several studies
that the PAMF detector obtains good performance even with non-Gaussian observations (see,
e.g., [22]).
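The threshold inversion of (3.27) and the detection probability (3.28) can be evaluated numerically as in the following sketch. It uses only the Python standard library; the function names are hypothetical. Rather than integrating the Bessel-function form directly, it relies on the standard representation of a non-central Chi-squared variable as a Poisson-weighted mixture of central Chi-squared variables, which is a swapped-in but equivalent way of computing the same tail probability.

```python
import math

def rao_threshold(pf):
    """Invert Pf = exp(-gamma / 2) to obtain the threshold gamma = -2 ln(Pf)."""
    if not 0.0 < pf < 1.0:
        raise ValueError("pf must lie strictly between 0 and 1")
    return -2.0 * math.log(pf)

def rao_pd(gamma, lam, tol=1e-12, max_terms=10000):
    """Tail probability at gamma of a non-central chi-squared variable (2 dof, parameter lam).

    Uses Pd = sum_k Poisson(k; lam/2) * Pr{chi2 with 2(k+1) dof > gamma}, where the
    central tail for even degrees of freedom is exp(-y) * sum_{i<=k} y^i / i!, y = gamma/2.
    Intended for moderate lam (exp(-lam/2) must not underflow).
    """
    half_lam = 0.5 * lam
    half_g = 0.5 * gamma
    weight = math.exp(-half_lam)   # Poisson weight for k = 0
    term = math.exp(-half_g)       # exp(-y) * y**0 / 0!
    tail = term                    # central tail with 2 dof
    pd = 0.0
    cum_weight = 0.0
    for k in range(max_terms):
        pd += weight * tail
        cum_weight += weight
        if 1.0 - cum_weight < tol:  # remaining Poisson mass is negligible
            break
        weight *= half_lam / (k + 1)  # Poisson weight for k + 1
        term *= half_g / (k + 1)      # exp(-y) * y**(k+1) / (k+1)!
        tail += term                  # central tail with 2(k+2) dof
    return pd

gamma = rao_threshold(0.01)      # threshold for Pf = 0.01
pd_null = rao_pd(gamma, 0.0)     # with lam = 0 the detection probability reduces to Pf
pd_weak = rao_pd(gamma, 5.0)
pd_strong = rao_pd(gamma, 20.0)
```

When the non-centrality parameter is zero the result coincides with $P_f$, and it increases with the non-centrality parameter, consistent with (3.22).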


3.4  Numerical  Results 

In the following, we present our numerical results of the parametric Rao/PAMF detector obtained
by computer simulation and by the above asymptotic analysis. In addition, the performance
of the MF (8.3) and AMF (8.5) detectors, which can be computed analytically, is included for
comparison. For easy reference, Appendix A.4 contains a brief summary of relevant results,
which have been used to compute the performance of the two detectors. We reiterate that the MF
detector serves as a baseline only. We do not consider Kelly’s GLRT since a detailed comparison
between the GLRT and AMF detectors can be found in [12]. In the following, the disturbance
signal is generated as a multichannel AR(2) process with randomly generated AR coefficients A
and a spatial covariance matrix Q. In particular, A and Q are selected to ensure that Q is a valid
covariance matrix and, furthermore, A is chosen to ensure that the resulting AR process is stable.
Once A and Q are selected, they are fixed in all trials. The signal vector s is generated as in (3.4)
with randomly chosen normalized spatial and Doppler frequencies. The SINR is defined as

$$
\text{SINR} = |\alpha|^2\,\mathbf{s}^H \mathbf{R}^{-1} \mathbf{s}, \tag{3.29}
$$

where $\mathbf{R}$ is the $JN \times JN$ joint space-time covariance matrix of the disturbance $\mathbf{d}$, which can
be determined once A and Q are selected (the details are not shown for simplicity). To numerically
set the threshold for the parametric Rao/PAMF detector, a total of $5 \times 10^4$ trials are run.
Meanwhile, to determine Pd for a given threshold, a total of $10^4$ trials are run for each SINR.
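The text does not state how stability of the randomly generated AR coefficients was verified. One common way, sketched below as an assumption rather than as the procedure actually used, is to form the block companion matrix of the recursion in (3.7) and check that all of its eigenvalues lie strictly inside the unit circle. The function name `ar_is_stable` and the coefficient values are hypothetical.

```python
import numpy as np

def ar_is_stable(A_H):
    """Check stability of x(n) = -sum_p A^H(p) x(n - p) + eps(n).

    A_H : (P, J, J) array; A_H[p - 1] holds A^H(p).
    Returns True when every eigenvalue of the block companion matrix has modulus < 1.
    """
    P, J, _ = A_H.shape
    C = np.zeros((J * P, J * P), dtype=complex)
    # Top block row carries the recursion coefficients with the sign used in (3.7).
    C[:J, :] = np.concatenate([-A_H[p] for p in range(P)], axis=1)
    if P > 1:
        # Sub-diagonal identity blocks shift past snapshots down the state vector.
        C[J:, :-J] = np.eye(J * (P - 1))
    return bool(np.max(np.abs(np.linalg.eigvals(C))) < 1.0)

I2 = np.eye(2)
stable_ar1 = ar_is_stable(np.array([-0.5 * I2]))             # x(n) = 0.5 x(n-1) + eps(n)
unstable_ar1 = ar_is_stable(np.array([-1.5 * I2]))           # x(n) = 1.5 x(n-1) + eps(n)
stable_ar2 = ar_is_stable(np.array([-0.5 * I2, 0.3 * I2]))   # roots of z^2 - 0.5 z + 0.3
```

A randomly drawn coefficient set that fails this check could simply be redrawn or scaled down before being fixed for all trials.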

First, we consider the asymptotic distribution of the parametric Rao/PAMF test statistic obtained
in Section 3.3.2. Figure 3.1 depicts the quantile-quantile plot of the Rao/PAMF test statistic
under both hypotheses against the corresponding asymptotic distribution when J = 4, N = 32,
and K = 8, a case with limited training. It is seen that even with a relatively small data size, the
asymptotic distribution matches the sample test statistics well, with only some minor deviation
at the tail portion.
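A quantile-quantile comparison of this kind under $H_0$ can be sketched as follows. This is an illustrative standard-library example: the function names are hypothetical, and exact Chi-squared draws are used as a stand-in for simulated test statistics, so it shows only the mechanics of the comparison, not the results reported here.

```python
import math
import random

def chi2_2_quantile(p):
    """Quantile of the central chi-squared distribution with 2 dof (CDF 1 - exp(-x/2))."""
    return -2.0 * math.log(1.0 - p)

def qq_pairs(samples):
    """Pair ordered samples with theoretical quantiles at positions (i + 0.5) / M."""
    xs = sorted(samples)
    M = len(xs)
    return [(xs[i], chi2_2_quantile((i + 0.5) / M)) for i in range(M)]

random.seed(0)
# Stand-in for H0 test statistics: a chi-squared variable with 2 dof is twice a unit exponential.
stats = [2.0 * random.expovariate(1.0) for _ in range(5000)]
pairs = qq_pairs(stats)  # points close to the line y = x indicate a good match
```

Plotting the first element of each pair against the second gives a plot of the same type as Figure 3.1, with deviations from the diagonal expected mainly in the upper tail.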

Next, we examine the receiver operating characteristic (ROC) [7] of the parametric Rao/PAMF
detector. The parameters used in the simulation are J = 4, N = 32, and K = 256. Figure 3.2
depicts the ROC curves for the parametric Rao/PAMF test obtained by simulation and asymptotic
analysis, for SINR values of 0, 5, and 10 dB. It is seen that the simulation results match those
obtained by asymptotic analysis.

Figures 3.3 to 3.6 depict the probability of detection Pd versus SINR for the MF, AMF, and
the parametric Rao/PAMF detectors under various conditions that are specified below the figures.
In particular, Figures 3.3 and 3.5 correspond to the case with adequate training, for which the
RMB rule is satisfied (see discussions in Section 2), whereas Figures 3.4 and 3.6 correspond to
the case with limited training, for which the AMF detector does not even exist, since the training
size K = 8 is too small to meet the minimum training condition (2.13). An examination of these
figures reveals the following:

• When the assumptions of Section 3.2 are met, the asymptotic analysis provides a quite
accurate prediction of the performance of the parametric Rao/PAMF detector. The gap
between the asymptotic and simulated results is seen to widen as K and/or N decreases.
But even for the most challenging case with K = 8 and N = 16, the gap is about 0.5 dB,
as shown in Figure 3.6.

•  The  parametric  Rao/PAMF  detector  is  very  close  to  the  optimum  MF  detector.  The  gap 
between  the  two  detectors  closes  with  increasing  K  and/or  N . 

•  The  parametric  Rao/PAMF  detector  outperforms  the  AMF  detector  by  2  to  3  dB  when  the 
RMB  rule  is  marginally  satisfied.  This  agrees  with  earlier  observations  made  in  [23]. 

So far we have assumed that the model order P of the multichannel AR process is known
(cf. Assumption AS4). As mentioned in Remark 1 of Section 3.2, various model selection
techniques can be used to estimate P, and it is not unusual for these techniques to under- or
over-estimate the model order by a small number (relative to the true model order P) [35,36]. Hence,
it would be of interest to find out how the parametric Rao/PAMF detector performs when an
inaccurate model order estimate is used. This is shown in Figure 3.7, where the performance
of the Rao/PAMF detector using the true, an under-estimated, and an over-estimated model
order is depicted. As we can see, using an inaccurate model order estimate degrades the
detection performance, but the degradation is not significant, especially in the case of model order
over-estimation. Over-estimation is a more benign error since the high-order coefficients can be
estimated close to zero (provided that the size of the signals that can be used for estimation is
large enough). The above behavior of the parametric Rao/PAMF detector is typical and has been
consistently observed in other experiments with a similar setup. Here, we only considered the
case where the model order is incorrectly estimated by one unit. A larger performance variation
is expected if there is a larger estimation error for P.

Finally, Figure 3.8 depicts Pd versus SINR for the parametric Rao/PAMF detector when J =
4, Pf = 0.01 and N varies from N = 4 to N = 128. It is seen that the detection performance
improves with N.


3.5  Conclusions 

We have developed a parametric Rao test for the multichannel adaptive signal detection problem
by exploiting a multichannel AR model. We have derived the ML estimates of the parameters
involved in the test. The parametric Rao test is an asymptotic parametric GLRT, and the
asymptotic distributions of its test statistic under both hypotheses have been obtained in closed form.
We have shown that the PAMF test statistic has a form identical to that of the parametric Rao
test statistic; therefore, the PAMF test is also an asymptotic parametric GLRT. Computer
simulations show that: 1) our asymptotic analysis provides fairly accurate prediction of the
performance of the parametric Rao/PAMF test; 2) even with relatively limited training, the parametric
Rao/PAMF detector is quite close to the ideal MF detector; 3) the parametric Rao/PAMF detector
outperforms the AMF detector, which does not exploit a parametric model; and finally 4) the
performance of the parametric Rao/PAMF detector is affected by inaccurate model order estimation,
but the resulting performance degradation is tolerable when the model order estimation error is
small.

Our asymptotic analysis of the parametric Rao detector is based on several assumptions stated
in Section 3.2, including that the disturbance can be modeled as an AR(P) process with known
model order P and that the training signals are i.i.d. When these assumptions are violated, we
expect that the analysis will be less accurate, but may still be informative if the assumptions are
not significantly violated. For example, we have noticed in simulation that when the disturbance
is an MA process, the test threshold obtained by analysis assuming an AR model is still quite
accurate. One possible reason is that AR models are fairly general parametric models, and under
mild conditions, can be used to model or approximate a large class of stationary random processes
(e.g., an MA process can be approximated as an AR process with a high enough model order)
[30]. Nevertheless, there is a need to find out how accurate our analysis is in real systems with
real data when the assumptions of Section 3.2 may not all be met. This will be an interesting
future effort.

Figure 3.1: The quantile-quantile plot of the parametric Rao/PAMF test statistic and its asymptotic
distribution under H0 (upper plot) and H1 (lower plot), respectively, with J = 4, N = 32,
and K = 8. Specifically, the x-axis shows the ordered samples of the parametric Rao/PAMF test
statistic, while the y-axis shows the ordered samples of the asymptotic distribution.
Figure 3.2: The receiver operating characteristic (ROC) curves of the parametric Rao/PAMF
detector at various input SINR when J = 4, N = 32, and K = 256.

Figure 3.3: The probability of detection Pd versus the input SINR when Pf = 0.01, J = 4,
N = 32, and K = 256.

Figure 3.4: The probability of detection Pd versus the input SINR when Pf = 0.01, J = 4,
N = 32, and K = 8. Note that the AMF detector is not included since it cannot be implemented
for such a small K.

Figure 3.5: The probability of detection Pd versus the input SINR when Pf = 0.01, J = 4,
N = 16, and K = 128.

Figure 3.6: The probability of detection Pd versus the input SINR when Pf = 0.01, J = 4,
N = 16, and K = 8. Note that the AMF detector is not included since it cannot be implemented
for such a small K.

Figure 3.7: The probability of detection Pd versus the input SINR of the parametric Rao/PAMF
detector when the model order of the multichannel AR process used for computing the test statistic
is true (P = 2), under-estimated (assuming P = 1), and over-estimated (assuming P = 3),
along with Pf = 0.01, J = 4, N = 32, and K = 256.

Figure 3.8: (a) The impact of the pi
01 and J = 4. (b) “Zoomed-


Chapter  4 

Parametric  GLRT  for  Multichannel 
Adaptive  Signal  Detection 

4.1  Introduction 

In this chapter, we develop a parametric GLRT. It is natural to extend the results of [37,38] and
consider the parametric GLRT for several reasons. First, as shown in Section 4.2, the problem of
interest is a two-sided parameter testing problem that admits no uniformly most powerful (UMP)
solution [7]. A GLRT approach is widely used in such cases due to its good asymptotic properties,
including asymptotic CFAR and consistency. Second, the parametric GLRT may yield better
performance than the parametric Rao detector, especially when the data is limited. This is because
the latter is an asymptotic (large-sample) parametric GLRT [7, Appendix 6B]. Third, all Rao
tests, including the parametric Rao detector, are obtained based on a further approximation that
is valid only for weak signals [7, p. 238]. As such, the parametric Rao detector is expected to
degrade when the weak signal assumption is violated. The above observations motivate us to
consider the parametric GLRT, in the hope of finding a better solution to the problem.

The  parametric  GLRT  to  be  discussed  is  different  from  Kelly’s  GLRT  [13].  The  latter  does 
not  utilize  a  parametric  model  to  model  the  disturbance.  For  this  reason,  our  solution  is  referred 
to  as  the  parametric  GLRT.  Our  parametric  GLRT  is  also  different  from  the  GLRT  developed 
in  [24],  where  a  different  detection  problem  is  addressed  that  involves  unknown  non-linear  signal 
parameters  associated  with  the  signal  to  be  detected.  We  follow  the  direction  of  [9-13, 16,22,23] 
and  consider  a  detection  problem  whereby  the  signal  to  be  detected  is  known  up  to  an  unknown 
amplitude.  The  data  model  and  assumptions  for  this  problem  are  further  discussed  in  Section  4.2. 

The parametric GLRT relies on maximum likelihood (ML) parameter estimation for both the
null and alternative hypotheses. The null hypothesis estimation problem is addressed in [37],
where the ML estimator is obtained in closed form. We show in Section 4.3.1 that the ML estimator
under the alternative hypothesis is non-linear and requires searches on a two-dimensional
parameter space. To address this issue, we introduce an asymptotic ML (AML) estimator that
is considerably simpler, yielding estimates in a non-iterative fashion, and asymptotically coincides
with the optimum ML estimator. The AML estimator is related to an iterative alternating
least-squares (ALS) estimator developed in [39], but with several notable distinctions (see Section
4.3.3). The Cramér-Rao bound (CRB) for the estimation problem is also derived, offering a
baseline for comparing various (unbiased) estimators.

To  examine  the  performance  of  the  parametric  GLRT,  we  consider  scenarios  with  very  limited 
or even no training signals. The less challenging case with more training is extensively considered
in, e.g., [23,37,38], for the PAMF and parametric Rao detectors, which are equivalent to the
parametric  GLRT  with  a  large  amount  of  training.  It  should  be  noted  that  the  parametric  GLRT 
and  Rao  detectors  utilize  both  test  and  training  signals  for  parameter  estimation;  as  such,  they  are 
functional  even  without  training.  The  capability  to  handle  the  training-free  case  is  a  unique  and 
desirable  attribute  of  the  parametric  GLRT  and  Rao  detectors.  Although  the  performance  of  the 
parametric  GLRT  and  Rao  detector  degrades  in  the  absence  of  training,  such  degradation  can  be 
remedied  by  using  a  larger  N,  i.e.,  increasing  temporal  observations  of  the  test  signal.  We  show 
that  the  parametric  GLRT  outperforms  the  parametric  Rao  detector  when  N  is  small  and,  overall, 
the  former  yields  a  better  detection  performance. 

The  rest  of  the  chapter  is  organized  as  follows.  Section  4.2  contains  the  problem  statement. 
Parameter estimation is addressed in Section 4.3, including the ML estimators for both hypotheses,
the AML estimator for the alternative hypothesis, and the CRB. The test statistic, implementation,
and asymptotic analysis for the parametric GLRT are discussed in Section 4.4. Numerical
results  are  presented  in  Section  4.5,  followed  by  our  conclusions  in  Section  4.6. 


4.2  Data  Model  and  Problem  Statement 

In this section, we restate the data model and problem statement for completeness and
easy reference. The problem of interest is to detect a known multichannel signal with unknown
amplitude in the presence of spatially and temporally correlated disturbance (e.g., [1]):

\[
\begin{aligned}
H_0 &:\ \mathbf{x}_0(n) = \mathbf{d}_0(n), & n &= 0, 1, \ldots, N-1, \\
H_1 &:\ \mathbf{x}_0(n) = \alpha\,\mathbf{s}(n) + \mathbf{d}_0(n), & n &= 0, 1, \ldots, N-1,
\end{aligned}
\tag{4.1}
\]

where all vectors are $J \times 1$, with $J$ denoting the number of spatial channels, and $N$ is the number
of temporal observations. In the sequel, $\mathbf{x}_0(n)$ is referred to as the test signal, $\mathbf{s}(n)$ is the signal
to be detected with amplitude $\alpha$, and $\mathbf{d}_0(n)$ is the disturbance signal that may be correlated in
space and time. In addition to the test signal $\mathbf{x}_0(n)$, there may be a set of target-free training or
secondary signals $\mathbf{x}_k(n)$, $k = 1, 2, \ldots, K$, to assist in the signal detection process:

\[
\mathbf{x}_k(n) = \mathbf{d}_k(n), \qquad n = 0, 1, \ldots, N-1. \tag{4.2}
\]

In radar systems, training data may be obtained from range cells adjacent to the test cell. However,
training data is generally limited or may even be unavailable. In the training-free case, we
have $K = 0$.

Define the following $JN \times 1$ space-time vectors:

\[
\begin{aligned}
\mathbf{s} &= [\mathbf{s}^T(0),\ \mathbf{s}^T(1),\ \ldots,\ \mathbf{s}^T(N-1)]^T, \\
\mathbf{d}_k &= [\mathbf{d}_k^T(0),\ \mathbf{d}_k^T(1),\ \ldots,\ \mathbf{d}_k^T(N-1)]^T, \\
\mathbf{x}_k &= [\mathbf{x}_k^T(0),\ \mathbf{x}_k^T(1),\ \ldots,\ \mathbf{x}_k^T(N-1)]^T, \qquad k = 0, 1, \ldots, K.
\end{aligned}
\tag{4.3}
\]

It follows that (4.1) can be more compactly written as

\[
H_0:\ \mathbf{x}_0 = \mathbf{d}_0, \qquad H_1:\ \mathbf{x}_0 = \alpha\,\mathbf{s} + \mathbf{d}_0. \tag{4.4}
\]

Clearly, the composite hypothesis testing problem (4.1) or (4.4) is a two-sided parameter testing
problem that tests $\alpha = 0$ against $\alpha \neq 0$. The above signal detection problem occurs in an airborne
STAP radar system with $J$ array channels and a coherent processing interval (CPI) of $N$ pulse
repetition intervals (PRIs). The disturbance $\mathbf{d}_k$ consists of ground clutter, jamming, and thermal
noise, while $\mathbf{s}$ is the target space-time steering vector [23].

The general assumptions in the literature are [1,9-13,15,16,22,23]:


•  AS1: The signal vector $\mathbf{s}$ is deterministic and known to the detector;

•  AS2: The signal amplitude $\alpha$ is complex-valued, deterministic, and unknown;

•  AS3: The disturbance signals $\{\mathbf{d}_k\}_{k=0}^{K}$ are independent and identically distributed (i.i.d.)
with distribution $\mathcal{CN}(\mathbf{0}, \mathbf{R})$, where $\mathbf{R}$ is the unknown space-time covariance matrix.

While AS1 to AS3 are standard [1,9-13,15,22,23], we follow a parametric approach as in
[22,23,38,40]:


•  AS4: The disturbance signal $\mathbf{d}_k(n)$, $k = 0, 1, \ldots, K$, can be modeled as a $J$-channel
AR($P$) process with known$^1$ model order $P$:

\[
\mathbf{d}_k(n) = -\sum_{p=1}^{P} \mathbf{A}^H(p)\,\mathbf{d}_k(n-p) + \boldsymbol{\varepsilon}_k(n), \tag{4.5}
\]

where $\{\mathbf{A}^H(p)\}_{p=1}^{P}$ denote the unknown $J \times J$ AR coefficient matrices, and $\boldsymbol{\varepsilon}_k(n)$ denote
the driving $J$-channel spatial noise vectors that are temporally white but spatially colored
Gaussian noise: $\boldsymbol{\varepsilon}_k(n) \sim \mathcal{CN}(\mathbf{0}, \mathbf{Q})$, where $\mathbf{Q}$ denotes the unknown $J \times J$ spatial
covariance matrix.
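
To make the AR model in AS4 concrete, the following is a minimal simulation sketch, not the procedure used to generate the results in this chapter. The function name `simulate_ar_disturbance`, the burn-in length, and the example coefficients are illustrative assumptions; the recursion uses the sign convention of (4.5), in which the AR term is subtracted.

```python
import numpy as np

def simulate_ar_disturbance(A_H, Q, N, rng, burn_in=200):
    """Return a (J, N) array whose column n is d(n) from the AR recursion (4.5).

    A_H : list of P arrays of shape (J, J); A_H[p-1] stands for A^H(p).
    Q   : (J, J) Hermitian positive-definite spatial covariance of the driving noise.
    """
    P, J = len(A_H), Q.shape[0]
    C = np.linalg.cholesky(Q)                      # Q = C C^H, used to color the noise
    total = P + burn_in + N                        # zero start, transient discarded below
    d = np.zeros((J, total), dtype=complex)
    for n in range(P, total):
        w = (rng.standard_normal(J) + 1j * rng.standard_normal(J)) / np.sqrt(2)
        d[:, n] = C @ w                            # eps(n) ~ CN(0, Q)
        for p in range(1, P + 1):
            d[:, n] -= A_H[p - 1] @ d[:, n - p]    # minus sign as in (4.5)
    return d[:, -N:]

# Hypothetical stable example: J = 2, P = 1, i.e. d(n) = 0.5 d(n-1) + eps(n)
rng = np.random.default_rng(0)
A_H = [-0.5 * np.eye(2)]
Q = np.eye(2)
d = simulate_ar_disturbance(A_H, Q, N=20000, rng=rng)
print(d.shape, np.mean(np.abs(d) ** 2))            # per-channel power, theory: 1 / (1 - 0.25)
```

Discarding a burn-in segment is one simple way to approximate a stationary start; the example coefficients are chosen only so that the process is stable.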

$^1$If $P$ is unknown, it can be estimated using a variety of model order selection techniques (e.g., [35, Appendix

The problem of interest is to develop a GLRT based on Assumptions AS1 to AS4 for the above
composite hypothesis testing problem, using the test signal $\mathbf{x}_0$ and training signals $\{\mathbf{x}_k\}_{k=1}^{K}$, if any.
The likelihood functions under both hypotheses are parameterized by the signal parameter $\alpha$ as
well as the nuisance parameters $\mathbf{Q}$ and $\mathbf{A}$, where

\[
\mathbf{A}^H = [\mathbf{A}^H(1),\ \mathbf{A}^H(2),\ \ldots,\ \mathbf{A}^H(P)] \in \mathbb{C}^{J \times JP}. \tag{4.6}
\]

For simplicity, we write the likelihood functions as

\[
f_i(\alpha, \mathbf{A}, \mathbf{Q}), \qquad i = 0 \text{ or } 1, \tag{4.7}
\]

where $\alpha = 0$ under $H_0$ (i.e., $i = 0$) and $\alpha \neq 0$ under $H_1$ ($i = 1$), and the dependence on the
test/training signals $\{\mathbf{x}_k\}_{k=0}^{K}$ is omitted. While the test statistic of the GLRT is well known and
given by the generalized likelihood ratio (GLR) [7]:

\[
\mathrm{GLR} = \frac{\max_{\alpha, \mathbf{A}, \mathbf{Q}} f_1(\alpha, \mathbf{A}, \mathbf{Q})}{\max_{\mathbf{A}, \mathbf{Q}} f_0(0, \mathbf{A}, \mathbf{Q})}, \tag{4.8}
\]

finding the ML estimates of the unknown parameters is non-trivial. We first address the estimation
problem before examining the GLR test statistic in more detail.


4.3  Parameter  Estimation 

Parameter  estimators  required  by  the  parametric  GLRT  as  well  as  the  CRB  are  developed  in 
Appendices  B.1-B.4.  The  main  results  are  summarized  below. 


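4.3.1 ML Estimation under H1


The ML estimate of $\alpha$ under $H_1$ is given by (see Appendix B.1)

\[
\hat{\alpha}_{\mathrm{ML}} = \arg\min_{\alpha} \left| \hat{\mathbf{R}}_{xx}(\alpha) - \hat{\mathbf{R}}_{yx}^H(\alpha)\,\hat{\mathbf{R}}_{yy}^{-1}(\alpha)\,\hat{\mathbf{R}}_{yx}(\alpha) \right|, \tag{4.9}
\]

where the correlation matrices conditioned on $\alpha$ are given by

\[
\hat{\mathbf{R}}_{xx}(\alpha) = \sum_{n=P}^{N-1} [\mathbf{x}_0(n) - \alpha\mathbf{s}(n)]\,[\mathbf{x}_0(n) - \alpha\mathbf{s}(n)]^H
+ \sum_{n=P}^{N-1} \sum_{k=1}^{K} \mathbf{x}_k(n)\,\mathbf{x}_k^H(n), \tag{4.10}
\]

\[
\hat{\mathbf{R}}_{yy}(\alpha) = \sum_{n=P}^{N-1} [\mathbf{y}_0(n) - \alpha\mathbf{t}(n)]\,[\mathbf{y}_0(n) - \alpha\mathbf{t}(n)]^H
+ \sum_{n=P}^{N-1} \sum_{k=1}^{K} \mathbf{y}_k(n)\,\mathbf{y}_k^H(n), \tag{4.11}
\]

\[
\hat{\mathbf{R}}_{yx}(\alpha) = \sum_{n=P}^{N-1} [\mathbf{y}_0(n) - \alpha\mathbf{t}(n)]\,[\mathbf{x}_0(n) - \alpha\mathbf{s}(n)]^H
+ \sum_{n=P}^{N-1} \sum_{k=1}^{K} \mathbf{y}_k(n)\,\mathbf{x}_k^H(n), \tag{4.12}
\]

with

\[
\mathbf{y}_k(n) = [\mathbf{x}_k^T(n-1),\ \ldots,\ \mathbf{x}_k^T(n-P)]^T, \tag{4.13}
\]

\[
\mathbf{t}(n) = [\mathbf{s}^T(n-1),\ \ldots,\ \mathbf{s}^T(n-P)]^T. \tag{4.14}
\]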

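Once $\hat{\alpha}_{\mathrm{ML}}$ is available, the ML estimates of $\mathbf{A}$ and $\mathbf{Q}$ under $H_1$ are obtained as

\[
\hat{\mathbf{A}}^H_{\mathrm{ML},1} = \hat{\mathbf{A}}^H(\alpha)\big|_{\alpha = \hat{\alpha}_{\mathrm{ML}}}, \tag{4.15}
\]

\[
\hat{\mathbf{Q}}_{\mathrm{ML},1} = \hat{\mathbf{Q}}(\alpha)\big|_{\alpha = \hat{\alpha}_{\mathrm{ML}}}, \tag{4.16}
\]

where the conditional estimates are given by

\[
\hat{\mathbf{A}}^H(\alpha) = -\hat{\mathbf{R}}_{yx}^H(\alpha)\,\hat{\mathbf{R}}_{yy}^{-1}(\alpha), \tag{4.17}
\]

\[
\hat{\mathbf{Q}}(\alpha) = \frac{1}{L} \left[ \hat{\mathbf{R}}_{xx}(\alpha) - \hat{\mathbf{R}}_{yx}^H(\alpha)\,\hat{\mathbf{R}}_{yy}^{-1}(\alpha)\,\hat{\mathbf{R}}_{yx}(\alpha) \right], \tag{4.18}
\]

with

\[
L = (K + 1)(N - P). \tag{4.19}
\]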
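
As a rough illustration of how the conditional estimates could be evaluated for a candidate amplitude, the sketch below forms the correlation matrices (4.10)-(4.12) and then (4.17)-(4.19), together with the determinant cost in (4.9). The function names `regressors` and `conditional_estimates` and the array layout (signals stored as $J \times N$ arrays, training signals passed as a list) are assumptions made for this example only.

```python
import numpy as np

def regressors(x, P):
    """Stack y(n) = [x^T(n-1), ..., x^T(n-P)]^T for n = P..N-1 into a (J*P, N-P) array."""
    N = x.shape[1]
    return np.vstack([x[:, P - p:N - p] for p in range(1, P + 1)])

def conditional_estimates(alpha, x0, s, train, P):
    """Return (A_H, Q, cost) for a given alpha, following (4.10)-(4.12) and (4.17)-(4.19).

    x0, s : (J, N) test signal and steering signal; train : list of (J, N) training signals.
    """
    r0 = x0 - alpha * s                              # test signal with candidate target removed
    X = np.hstack([r0[:, P:]] + [xk[:, P:] for xk in train])
    Y = np.hstack([regressors(r0, P)] + [regressors(xk, P) for xk in train])
    Rxx = X @ X.conj().T                             # (4.10)
    Ryy = Y @ Y.conj().T                             # (4.11)
    Ryx = Y @ X.conj().T                             # (4.12)
    L = X.shape[1]                                   # (K + 1)(N - P), eq. (4.19)
    A_H = -np.linalg.solve(Ryy, Ryx).conj().T        # -Ryx^H Ryy^{-1}, eq. (4.17)
    Q = (Rxx + A_H @ Ryx) / L                        # eq. (4.18)
    cost = np.linalg.det(L * Q).real                 # determinant minimized in (4.9)
    return A_H, Q, cost
```

Because the training terms do not depend on $\alpha$, an implementation that evaluates the cost repeatedly could precompute them once; the sketch recomputes everything for clarity.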


Remark 1: Although statistically optimum, $\hat{\alpha}_{\mathrm{ML}}$ has no closed-form expression. The cost
function (4.9) is highly non-linear. A brute-force exhaustive search over the two-dimensional
parameter space (i.e., the real and imaginary parts of $\alpha$) is generally impractical. Alternatively, we
can resort to Newton-like iterative non-linear searches, provided an initial estimate of $\alpha$ is available.
Hence, there is a need for suboptimum estimators with reduced computational complexity.
One such suboptimum estimator is discussed in Section 4.3.3.


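4.3.2 ML Estimation under H0


This is a special case of the one addressed in Section 4.3.1 (see Appendix B.1). The ML estimates
of $\mathbf{A}$ and $\mathbf{Q}$ under $H_0$ are given by

\[
\hat{\mathbf{A}}^H_{\mathrm{ML},0} = -\hat{\mathbf{R}}_{yx,0}^H\,\hat{\mathbf{R}}_{yy,0}^{-1}, \tag{4.20}
\]

\[
\hat{\mathbf{Q}}_{\mathrm{ML},0} = \frac{1}{L} \left( \hat{\mathbf{R}}_{xx,0} - \hat{\mathbf{R}}_{yx,0}^H\,\hat{\mathbf{R}}_{yy,0}^{-1}\,\hat{\mathbf{R}}_{yx,0} \right), \tag{4.21}
\]

where the correlation matrices $\hat{\mathbf{R}}_{xx,0}$, $\hat{\mathbf{R}}_{yy,0}$, $\hat{\mathbf{R}}_{yx,0}$ are obtained from (4.10)-(4.12), respectively,
by setting $\alpha = 0$.

4.3.3 Asymptotic ML Estimation under H1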


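We now introduce a computationally more efficient estimator that is asymptotically equivalent
to the ML estimator. The estimator is henceforth referred to as the AML estimator. The idea is
to replace the $\alpha$-dependent AR coefficient estimate $\hat{\mathbf{A}}^H(\alpha) = -\hat{\mathbf{R}}_{yx}^H(\alpha)\hat{\mathbf{R}}_{yy}^{-1}(\alpha)$ that appears in the
cost function of (4.9) by a statistically consistent estimate $\hat{\mathbf{A}}_{\mathrm{con}}^H$
(how to obtain such a consistent estimate is discussed next). The resulting cost function $C_1(\alpha)$,
which can be shown to be asymptotically equivalent to the cost function of (4.9) (e.g., [31,41]),
can be written as

\[
C_1(\alpha) = \Bigg| \sum_{n=P}^{N-1} \big[\tilde{\mathbf{x}}_0(n; \hat{\mathbf{A}}_{\mathrm{con}}) - \alpha\,\tilde{\mathbf{s}}(n; \hat{\mathbf{A}}_{\mathrm{con}})\big] \big[\tilde{\mathbf{x}}_0(n; \hat{\mathbf{A}}_{\mathrm{con}}) - \alpha\,\tilde{\mathbf{s}}(n; \hat{\mathbf{A}}_{\mathrm{con}})\big]^H
+ \sum_{n=P}^{N-1} \sum_{k=1}^{K} \tilde{\mathbf{x}}_k(n; \hat{\mathbf{A}}_{\mathrm{con}})\,\tilde{\mathbf{x}}_k^H(n; \hat{\mathbf{A}}_{\mathrm{con}}) \Bigg|, \tag{4.22}
\]

where $\tilde{\mathbf{x}}_k(n; \hat{\mathbf{A}}_{\mathrm{con}})$ and $\tilde{\mathbf{s}}(n; \hat{\mathbf{A}}_{\mathrm{con}})$ are the temporally whitened versions of $\mathbf{x}_k(n)$ and $\mathbf{s}(n)$,
respectively, using the consistent AR coefficient estimate $\hat{\mathbf{A}}_{\mathrm{con}}$:

\[
\tilde{\mathbf{x}}_k(n; \hat{\mathbf{A}}_{\mathrm{con}}) = \mathbf{x}_k(n) + \hat{\mathbf{A}}_{\mathrm{con}}^H\,\mathbf{y}_k(n), \tag{4.23}
\]

\[
\tilde{\mathbf{s}}(n; \hat{\mathbf{A}}_{\mathrm{con}}) = \mathbf{s}(n) + \hat{\mathbf{A}}_{\mathrm{con}}^H\,\mathbf{t}(n). \tag{4.24}
\]

Note that the matrix inside the determinant of (4.22) is quadratic in $\alpha$. An asymptotic
solution is obtained in Appendix B.2, which is given by

\[
\hat{\alpha}_{\mathrm{AML}} = \frac{\mathrm{tr}\big(\tilde{\mathbf{S}}^H \hat{\boldsymbol{\Psi}}^{-1} \tilde{\mathbf{X}}_0\big)}{\mathrm{tr}\big(\tilde{\mathbf{S}}^H \hat{\boldsymbol{\Psi}}^{-1} \tilde{\mathbf{S}}\big)}, \tag{4.25}
\]

where

\[
\tilde{\mathbf{S}} = \big[\tilde{\mathbf{s}}(P; \hat{\mathbf{A}}_{\mathrm{con}}),\ \ldots,\ \tilde{\mathbf{s}}(N-1; \hat{\mathbf{A}}_{\mathrm{con}})\big], \tag{4.26}
\]

\[
\tilde{\mathbf{X}}_k = \big[\tilde{\mathbf{x}}_k(P; \hat{\mathbf{A}}_{\mathrm{con}}),\ \ldots,\ \tilde{\mathbf{x}}_k(N-1; \hat{\mathbf{A}}_{\mathrm{con}})\big], \tag{4.27}
\]

\[
\hat{\boldsymbol{\Psi}} = \tilde{\mathbf{X}}_0 \mathbf{P}^{\perp} \tilde{\mathbf{X}}_0^H + \sum_{k=1}^{K} \tilde{\mathbf{X}}_k \tilde{\mathbf{X}}_k^H, \tag{4.28}
\]

with $\mathbf{P}^{\perp}$ denoting the projection matrix projecting onto the orthogonal complement of the range of
$\tilde{\mathbf{S}}^H$:

\[
\mathbf{P}^{\perp} = \mathbf{I} - \mathbf{P} = \mathbf{I} - \tilde{\mathbf{S}}^{\dagger}\tilde{\mathbf{S}}, \tag{4.29}
\]

where $(\cdot)^{\dagger}$ denotes the Moore-Penrose pseudo-inverse.

The above AML estimator requires a consistent estimate $\hat{\mathbf{A}}_{\mathrm{con}}$, which can be obtained by
using a consistent estimate of $\alpha$ in (4.17). One such estimate is obtained by the least-squares
(LS) amplitude estimator:

\[
\hat{\alpha}_{\mathrm{LS}} = \frac{\mathbf{s}^H \mathbf{x}_0}{\mathbf{s}^H \mathbf{s}}, \tag{4.30}
\]

which ignores the fact that the disturbance signal is colored. We show in Appendix B.3 that $\hat{\alpha}_{\mathrm{LS}}$
is statistically unbiased and consistent.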

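To summarize, the AML estimator can be implemented as follows:

•  Step 1: Determine a consistent estimate $\hat{\mathbf{A}}_{\mathrm{con}}$. This can be obtained by first computing the
LS amplitude estimate $\hat{\alpha}_{\mathrm{LS}}$ as in (4.30), and using $\hat{\alpha}_{\mathrm{LS}}$ in (4.17):

\[
\hat{\mathbf{A}}_{\mathrm{con}}^H = -\hat{\mathbf{R}}_{yx}^H(\hat{\alpha}_{\mathrm{LS}})\,\hat{\mathbf{R}}_{yy}^{-1}(\hat{\alpha}_{\mathrm{LS}}). \tag{4.31}
\]

•  Step 2: Compute the AML amplitude estimate $\hat{\alpha}_{\mathrm{AML}}$ using (4.25).

•  Step 3: Find the AML estimates of the AR coefficients and spatial covariance matrix by
substituting $\hat{\alpha}_{\mathrm{AML}}$ for $\alpha$ in (4.17) and (4.18), respectively.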
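
Steps 1 and 2 above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the implementation used for the results in Section 4.5: the function names, the $J \times N$ array layout, and the use of a pseudo-inverse with an explicit cutoff to build the projector in (4.29) are assumptions of the example. Step 3 would reuse (4.17) and (4.18) with the returned amplitude.

```python
import numpy as np

def regressors(x, P):
    """Stack [x^T(n-1), ..., x^T(n-P)]^T for n = P..N-1 into a (J*P, N-P) array."""
    N = x.shape[1]
    return np.vstack([x[:, P - p:N - p] for p in range(1, P + 1)])

def ar_coeff(alpha, x0, s, train, P):
    """Conditional AR estimate A^H(alpha) = -Ryx^H Ryy^{-1}, eq. (4.17)."""
    r0 = x0 - alpha * s
    X = np.hstack([r0[:, P:]] + [xk[:, P:] for xk in train])
    Y = np.hstack([regressors(r0, P)] + [regressors(xk, P) for xk in train])
    Ryy, Ryx = Y @ Y.conj().T, Y @ X.conj().T
    return -np.linalg.solve(Ryy, Ryx).conj().T

def aml_amplitude(x0, s, train, P):
    """Return (alpha_AML, alpha_LS) for (J, N) arrays x0, s and a list of training arrays."""
    # Step 1: LS amplitude (4.30), then the consistent AR estimate (4.31)
    alpha_ls = np.vdot(s, x0) / np.vdot(s, s)          # s^H x0 / s^H s
    A_con = ar_coeff(alpha_ls, x0, s, train, P)

    def whiten(z):                                      # temporal whitening, (4.23)-(4.24)
        return z[:, P:] + A_con @ regressors(z, P)

    S, X0 = whiten(s), whiten(x0)
    # Projector onto the orthogonal complement of range(S^H), (4.29);
    # the pseudo-inverse (cutoff 1e-10) copes with a rank-deficient whitened signal.
    P_perp = np.eye(S.shape[1]) - np.linalg.pinv(S, 1e-10) @ S
    Psi = X0 @ P_perp @ X0.conj().T                     # (4.28), test-signal term
    for xk in train:
        Xk = whiten(xk)
        Psi = Psi + Xk @ Xk.conj().T                    # (4.28), training terms
    # Step 2: AML amplitude (4.25); vdot(S, B) equals tr(S^H B)
    num = np.vdot(S, np.linalg.solve(Psi, X0))
    den = np.vdot(S, np.linalg.solve(Psi, S))
    return num / den, alpha_ls
```

Passing an empty list for `train` gives the training-free case $K = 0$.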


Remark 2: The AML estimator is obtained based on multiple approximations. The first
involves approximating the likelihood function by dropping out the initial samples of the AR
process, as shown in Appendix B.1. The second approximates the cost function in (4.9) by
$C_1(\alpha)$ as in (4.22), which replaces $\mathbf{A}$ with a consistent estimate $\hat{\mathbf{A}}_{\mathrm{con}}$. $C_1(\alpha)$ is further shown
to be equivalent to $C_2(\alpha)$ in Appendix B.2. The third approximation is to replace the nonlinear
$C_2(\alpha)$ by a quadratic $C_3(\alpha)$ in Appendix B.2, which admits a closed-form solution. All three
approximations are valid in the large-sample case.

Remark 3: The above AML estimator is related to an alternating LS (ALS) estimator discussed
in [39], but there are several notable differences. First, AML covers both training ($K \neq 0$)
and training-free ($K = 0$) cases, whereas ALS, which was introduced to solve an explosive detection
problem, considers only the case without training. Second, ALS is an iterative approach,
whereas iteration is not required by AML. Finally, by using an asymptotic approximation of the
ML cost function, AML is directly related to the ML estimator and asymptotically coincides with
the latter. Such an asymptotic relation was not established for ALS. Numerical results in Section
4.5 indicate that AML and ML yield nearly identical estimation performance.


4.3.4  CRB 


From Section 4.3.1, an amplitude estimate is obtained first and then used to produce the nuisance
parameter estimates. As such, amplitude estimation is the most critical step in the estimation
process. Next, we provide the CRB for amplitude estimation. The CRB specifies a lower bound
on the variance of any unbiased amplitude estimator, thus offering a baseline for comparison.
The CRB for $\alpha$ is derived in Appendix B.4, which is given by

\[
\mathrm{CRB}(\alpha) = \left[ \sum_{n=P}^{N-1} \tilde{\mathbf{s}}^H(n; \mathbf{A})\,\mathbf{Q}^{-1}\,\tilde{\mathbf{s}}(n; \mathbf{A}) \right]^{-1}. \tag{4.32}
\]

Like $\tilde{\mathbf{s}}(n; \hat{\mathbf{A}}_{\mathrm{con}})$ in (4.24), $\tilde{\mathbf{s}}(n; \mathbf{A})$ is the temporally whitened version of $\mathbf{s}(n)$, but it uses the true AR
coefficient matrix $\mathbf{A}$ (the dependence on $\mathbf{A}$ is explicitly shown):

\[
\tilde{\mathbf{s}}(n; \mathbf{A}) = \mathbf{s}(n) + \sum_{p=1}^{P} \mathbf{A}^H(p)\,\mathbf{s}(n-p). \tag{4.33}
\]

The CRB for the nuisance parameters can be obtained in a similar fashion, but is omitted for brevity.
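
For a simulation with known $\mathbf{A}$ and $\mathbf{Q}$, the bound (4.32) can be evaluated directly from the whitened signal (4.33). The sketch below is one possible way to do so; the function name `crb_amplitude` and the argument layout ($J \times N$ signal array, list of $\mathbf{A}^H(p)$ matrices) are assumptions of the example.

```python
import numpy as np

def crb_amplitude(s, A_H, Q):
    """Evaluate CRB(alpha) in (4.32) for a (J, N) signal array s.

    A_H : list of P arrays (J, J), A_H[p-1] standing for the true A^H(p).
    Q   : (J, J) true spatial covariance matrix.
    """
    P, N = len(A_H), s.shape[1]
    s_w = s[:, P:].astype(complex)                    # s(n), n = P..N-1
    for p in range(1, P + 1):
        s_w = s_w + A_H[p - 1] @ s[:, P - p:N - p]    # add A^H(p) s(n-p), eq. (4.33)
    # sum_n s~^H(n) Q^{-1} s~(n), written as an elementwise sum
    fisher = np.sum(s_w.conj() * np.linalg.solve(Q, s_w)).real
    return 1.0 / fisher

# Hypothetical example: J = 2, P = 1, A^H(1) = -0.5 I, Q = I, one complex sinusoid
n = np.arange(64)
s = np.array([[1.0], [np.exp(1j * 0.3 * np.pi)]]) * np.exp(2j * np.pi * 0.2 * n)
print(crb_amplitude(s, [-0.5 * np.eye(2)], np.eye(2)))
```

In this example the whitened signal has constant per-sample energy $2\,(1.25 - \cos 0.4\pi)$, so the bound equals the reciprocal of 63 times that value.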


4.4  Parametric  GLRT 


4.4.1  Test  Statistic 

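With the ML parameter estimates obtained in Sections 4.3.1 and 4.3.2, the GLR reduces to

\[
\mathrm{GLR} = \frac{\max_{\alpha, \mathbf{A}, \mathbf{Q}} f_1(\alpha, \mathbf{A}, \mathbf{Q})}{\max_{\mathbf{A}, \mathbf{Q}} f_0(0, \mathbf{A}, \mathbf{Q})}
= \left( \frac{\big|\hat{\mathbf{Q}}_{\mathrm{ML},0}\big|}{\big|\hat{\mathbf{Q}}_{\mathrm{ML},1}\big|} \right)^{L}, \tag{4.34}
\]

where $\hat{\mathbf{Q}}_{\mathrm{ML},0}$ and $\hat{\mathbf{Q}}_{\mathrm{ML},1}$ are given by (4.21) and (4.16), respectively. Equivalently, taking a logarithm
(with a scaling constant 2) yields$^2$

\[
T_{\mathrm{GLRT}} = 2L \ln \frac{\big|\hat{\mathbf{Q}}_{\mathrm{ML},0}\big|}{\big|\hat{\mathbf{Q}}_{\mathrm{ML},1}\big|} \ \underset{H_0}{\overset{H_1}{\gtrless}}\ \gamma_{\mathrm{GLRT}}, \tag{4.35}
\]

where $\gamma_{\mathrm{GLRT}}$ denotes the test threshold (see Section 4.4.2 for discussion on the setting of $\gamma_{\mathrm{GLRT}}$).

Remark 4: The final test statistic is a ratio of two matrix determinants. Note that the two
covariance matrix estimates $\hat{\mathbf{Q}}_{\mathrm{ML},0}$ and $\hat{\mathbf{Q}}_{\mathrm{ML},1}$ have an identical form given by (4.18), except that
$\alpha = 0$ for the former and $\alpha = \hat{\alpha}_{\mathrm{ML}}$ for the latter. Hence, once $\hat{\alpha}_{\mathrm{ML}}$ is obtained, the remaining
steps involved in calculating $\hat{\mathbf{Q}}_{\mathrm{ML},1}$ are very similar to those needed for $\hat{\mathbf{Q}}_{\mathrm{ML},0}$, which can be
performed by the same computing algorithm or hardware, thus simplifying implementation.

Remark 5: Instead of using the non-linear ML amplitude estimate $\hat{\alpha}_{\mathrm{ML}}$, we can employ the
computationally more efficient AML amplitude estimate $\hat{\alpha}_{\mathrm{AML}}$ in calculating the GLRT test statistic.
As shown in Section 4.5, the two different versions of the GLRT offer nearly identical detection
performance.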
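
Given an amplitude estimate (ML or AML), the statistic (4.35) only requires two evaluations of the conditional covariance estimate (4.18), as noted in Remark 4. A minimal sketch follows; the function names and array layout are assumptions of the example, and log-determinants are used in place of explicit determinants purely for numerical convenience.

```python
import numpy as np

def regressors(x, P):
    """Stack [x^T(n-1), ..., x^T(n-P)]^T for n = P..N-1 into a (J*P, N-P) array."""
    N = x.shape[1]
    return np.vstack([x[:, P - p:N - p] for p in range(1, P + 1)])

def q_hat(alpha, x0, s, train, P):
    """Conditional spatial covariance estimate Q(alpha), eq. (4.18)."""
    r0 = x0 - alpha * s
    X = np.hstack([r0[:, P:]] + [xk[:, P:] for xk in train])
    Y = np.hstack([regressors(r0, P)] + [regressors(xk, P) for xk in train])
    Rxx, Ryy, Ryx = X @ X.conj().T, Y @ Y.conj().T, Y @ X.conj().T
    return (Rxx - Ryx.conj().T @ np.linalg.solve(Ryy, Ryx)) / X.shape[1]

def glrt_statistic(alpha_hat, x0, s, train, P):
    """T_GLRT = 2 L ln(|Q(0)| / |Q(alpha_hat)|), eq. (4.35)."""
    L = (len(train) + 1) * (x0.shape[1] - P)          # eq. (4.19)
    _, logdet0 = np.linalg.slogdet(q_hat(0.0, x0, s, train, P))
    _, logdet1 = np.linalg.slogdet(q_hat(alpha_hat, x0, s, train, P))
    return 2.0 * L * (logdet0 - logdet1)
```

The value returned would then be compared with the threshold $\gamma_{\mathrm{GLRT}}$ to decide between $H_0$ and $H_1$.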

$^2$Although the factor of $2L$ and the logarithm in the test statistic can be absorbed by the test threshold, they are retained
to keep the asymptotic distribution more compact. See Section 4.4.2.

4.4.2 Asymptotic Analysis


As shown in Appendix B.5, the asymptotic distribution of the parametric GLRT statistic in (4.35)
is given by

\[
T_{\mathrm{GLRT}} \overset{a}{\sim}
\begin{cases}
\chi_2^2, & \text{under } H_0, \\
\chi_2'^{\,2}(\lambda), & \text{under } H_1,
\end{cases}
\tag{4.36}
\]

where $\chi_2^2$ denotes the central Chi-squared distribution with 2 degrees of freedom (i.e., exponential
distribution) and $\chi_2'^{\,2}(\lambda)$ the non-central Chi-squared distribution with 2 degrees of freedom and
non-centrality parameter $\lambda$:

\[
\lambda = 2|\alpha|^2 \sum_{n=P}^{N-1} \tilde{\mathbf{s}}^H(n; \mathbf{A})\,\mathbf{Q}^{-1}\,\tilde{\mathbf{s}}(n; \mathbf{A}). \tag{4.37}
\]

Note that $\lambda$ is related to the signal-to-interference-plus-noise ratio (SINR) at the output of the
temporal whitening filter. Using the above result, we can write the asymptotic detection and
false alarm probabilities as

\[
P_d = \int_{\gamma_{\mathrm{GLRT}}}^{\infty} \frac{1}{2} \exp\left[ -\frac{1}{2}(u + \lambda) \right] I_0\big(\sqrt{\lambda u}\big)\, du, \tag{4.38}
\]

\[
P_f = \exp\left( -\frac{1}{2}\gamma_{\mathrm{GLRT}} \right), \tag{4.39}
\]

where $I_0(\cdot)$ is the modified Bessel function of the first kind and zeroth order [7].

Remark  6:  The  asymptotic  distribution  under  H0  is  independent  of  the  unknown  parameters. 
The  probability  of  false  alarm  in  (4.39)  depends  only  on  the  test  threshold,  which  is  a  design 
parameter.  It  is  evident  that  the  parametric  GLRT  asymptotically  achieves  CFAR. 

Remark 7: The above analysis holds under Assumptions AS1 to AS4 of Section 4.2 with
one exception. In particular, since the ML parameter estimates are asymptotically Gaussian
irrespective of the distribution of the observed data, the above analysis still holds if the Gaussian
assumption in AS3 is dropped.
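
The asymptotic results can be turned into numbers with a few lines of standard-library Python. The sketch below inverts (4.39) to set the threshold for a target false alarm probability and evaluates the tail probability of the non-central Chi-squared distribution in (4.36) using its standard representation as a Poisson-weighted mixture of central Chi-squared tails, rather than integrating the Bessel-function form directly. The function names and the truncation length `terms` are assumptions of the example.

```python
import math

def glrt_threshold(pf):
    """Threshold obtained by inverting Pf = exp(-gamma / 2), eq. (4.39)."""
    return -2.0 * math.log(pf)

def detection_probability(gamma, lam, terms=200):
    """Asymptotic Pd = P(T > gamma) for T ~ non-central chi-square, 2 dof, parameter lam.

    Uses P(T > gamma) = sum_j Poisson(j; lam/2) * P(chi2_{2+2j} > gamma), where
    P(chi2_{2m} > gamma) = exp(-gamma/2) * sum_{i<m} (gamma/2)^i / i!.
    The series is truncated after `terms` terms, adequate for moderate lam.
    """
    half_g, half_l = gamma / 2.0, lam / 2.0
    weight = math.exp(-half_l)        # Poisson weight for j = 0
    tail_term = math.exp(-half_g)     # i = 0 term of the central tail sum
    tail = tail_term                  # P(chi2_2 > gamma)
    pd = 0.0
    for j in range(terms):
        pd += weight * tail
        weight *= half_l / (j + 1)    # next Poisson weight
        tail_term *= half_g / (j + 1)
        tail += tail_term             # tail for 2 more degrees of freedom
    return pd

gamma = glrt_threshold(0.01)
print(round(gamma, 4))
print(detection_probability(gamma, 0.0), detection_probability(gamma, 20.0))
```

With $\lambda = 0$ the detection probability reduces to $P_f$, which is a convenient sanity check on the threshold.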


4.5  Numerical  Results 

In this section, we present simulation results to illustrate the performance of the proposed detection
and estimation techniques. The disturbance signal is generated as a multichannel AR(2)
process with AR coefficients $\mathbf{A}$ and a spatial covariance matrix $\mathbf{Q}$. These parameters are set to
ensure that the AR process is stable and $\mathbf{Q}$ is a valid covariance matrix, but are otherwise randomly
selected. The signal vector $\mathbf{s}$ corresponds to a uniform equi-spaced linear array with $J = 4$
antenna elements and randomly selected normalized spatial and Doppler frequencies (see [23]).
The SINR is defined as

\[
\mathrm{SINR} = |\alpha|^2\,\mathbf{s}^H \mathbf{R}^{-1} \mathbf{s}, \tag{4.40}
\]

where the $JN \times JN$ space-time covariance matrix $\mathbf{R}$ can be uniquely determined once $\mathbf{A}$ and $\mathbf{Q}$ are
selected. Note that the above SINR can be considered as an overall SINR that takes into account
all spatial and temporal signals observed within one CPI. A different SINR that is also frequently
used is defined based on one snapshot of the array output; see, e.g., [39].

4.5.1  Estimation 

The estimation results are presented for the estimators discussed in Section 4.3, namely, the LS
(4.30), AML (4.25), and ML (4.9) estimators. The ML estimator is implemented via local nonlinear
iterative searching, initialized by the AML estimate. We first consider the training-free
case with $K = 0$. This is also the case considered by the ALS estimator [39] and, thus, we
include it for comparison. Figure 4.1 shows the mean-squared error (MSE) of the amplitude
estimate $\hat{\alpha}$ obtained by each estimator, along with the CRB (4.32), versus the SINR. It is seen
that even for a moderate value of $N = 32$, the AML amplitude estimate is nearly identical to
the ML estimate, and both are very close to the CRB and considerably better than the simple LS
estimate. It is also observed that the ALS estimate is nearly identical to the AML estimate in this
case.

Figure  4.2  depicts  the  results  for  a  limited-training  case  with  K  =  1.  The  LS  and  ALS 
estimators  are  not  included  since  they  do  not  utilize  any  training  signal  for  estimation.  It  is  seen 
that  both  the  AML  and  ML  estimates  are  nearly  identical  and  close  to  the  CRB  for  all  values  of 
SINR.  It  is  observed  that  use  of  training  data  slightly  improves  the  estimation  performance. 

4.5.2  Detection 

For the parametric GLRT (4.35), the test statistic can be computed using either the ML or AML
parameter estimates, as indicated in Remark 5 of Section 4.4.1. The resulting tests, which are
denoted as parametric GLRT/ML and parametric GLRT/AML, respectively, are compared with
the parametric Rao detector [37,38], which is a large-sample approximation of the parametric
GLRT. Also included in the comparison are the asymptotic analysis for the parametric GLRT
given in Section 4.4.2, the ideal matched filter (MF) [12], which assumes exact knowledge of
$\mathbf{R}$ and therefore cannot be used in practice but offers a baseline for comparison, and Kelly's
GLRT [13], which is included to show the gain offered by parametric detection.$^3$ In all examples,
we set $J = 4$ and the probability of false alarm $P_f = 0.01$.

The training-free ($K = 0$) case is considered in Figures 4.3 to 4.5, which show the probability
of detection versus SINR for various detectors when the number of temporal observations
is $N = 32$, 64, and 128. Meanwhile, the limited-training ($K = 1$) case is considered in
Figures 4.6 and 4.7 for $N = 32$ and 64, respectively. An examination of these figures reveals the
following:

•  The parametric GLRT/AML yields nearly identical detection performance to that of the
parametric GLRT/ML, and may be preferred to the latter due to its reduced complexity.

•  For the training-free case, the parametric GLRT is about 3 to 4 dB from the optimum
MF bound at $N = 32$; the gap reduces to about 1 dB at $N = 64$ and a fraction of a dB at
$N = 128$. Training, even modest, helps improve the detection, which can be seen by
comparing Figure 4.3 with Figure 4.6, or Figure 4.4 with Figure 4.7. However, the degradation
incurred by lack of training can be remedied by increasing temporal observations
of the test signal, as seen in Figures 4.3 to 4.5.

•  For small $N$ (e.g., $N = 32$), the parametric GLRT outperforms the parametric Rao detector.
At larger values of $N$, the two detectors exhibit similar performance, especially in the
low SINR region.

•  The parametric Rao detector is seen to degrade dramatically as the SINR increases. This
is not surprising since all Rao tests, including the parametric Rao detector, are based on a
weak signal approximation of the GLRT [7, Appendix 6B]. This has also been observed
in [42, Fig. 3] for a single-channel detection problem. Such degradation may not be critical
in applications where weak signal detection is of primary interest.

•  Compared to Kelly's GLRT, both parametric detectors can produce better detection performance
with significantly less training or even no training, when $N$ is not too small.

$^3$Recall that Kelly's GLRT is a covariance matrix based detector that cannot handle the limited-training or
training-free case. In the following examples, we use either $K = 0$ or $K = 1$ for the parametric detectors; but
for Kelly's GLRT, $K$ is chosen significantly larger so that $K > JN - 1$ to ensure a non-singular estimate of $\mathbf{R}$ (see
discussions in Section 4.1).


4.6  Conclusions 

We have developed a new parametric GLRT for multichannel adaptive signal detection. The
parametric GLRT is obtained by exploiting multichannel AR modeling for the disturbance signal.
We have investigated the underlying parameter estimation problem. The ML estimator has
been derived, but not in a closed form. An AML estimator has been introduced as an asymptotically
optimum but computationally more efficient alternative. We have examined the detection
performance of the parametric GLRT as well as a recently proposed parametric Rao detector. We
have shown that while both parametric detectors are significantly less dependent on training than
conventional covariance matrix based detectors, the parametric GLRT is the better solution of the
two, especially when temporal observations of the test signal are limited.

One of the most interesting features of the parametric GLRT and Rao detectors is that both use the
test and training signals for parameter estimation and can handle the training-free case. We
have shown that the performance degradation caused by the lack of training can be remedied by
increasing the temporal observations of the test signal. Such a tradeoff may be exploited in
applications such as radar when the environment is so heterogeneous that neighboring range
cells cannot be used for training. In particular, the i.i.d. assumption AS3 concerning the test cell
and the neighboring range cells will be seriously violated in that case.

Figure 4.1: MSE of amplitude estimate $\hat{\alpha}$ versus SINR when $J = 4$, $N = 32$, and $K = 0$ (no
training data).

Figure 4.2: MSE of amplitude estimate $\hat{\alpha}$ versus SINR when $J = 4$, $N = 32$, and $K = 1$ (limited
training data).

Figure 4.3: Probability of detection $P_d$ versus SINR when $P_f = 0.01$, $J = 4$, $N = 32$, and $K = 0$
(no training data).

Figure 4.4: Probability of detection $P_d$ versus SINR when $P_f = 0.01$, $J = 4$, $N = 64$, and $K = 0$
(no training data).

Figure 4.5: Probability of detection $P_d$ versus SINR when $P_f = 0.01$, $J = 4$, $N = 128$, and
$K = 0$ (no training data).

Figure 4.6: Probability of detection $P_d$ versus SINR when $P_f = 0.01$, $J = 4$, $N = 32$, and $K = 1$
(limited training data).

Figure 4.7: Probability of detection $P_d$ versus SINR when $P_f = 0.01$, $J = 4$, $N = 64$, and $K = 1$
(limited training data).

Chapter 5

A  Simplified  Parametric  GLRT  for 
Multichannel  Adaptive  Signal  Detection 

5.1  Introduction 

Parametric STAP detectors have recently gained considerable interest due to their ability to
offer significant performance improvement over classical detectors in training-limited
cases. Specifically, the parametric adaptive matched filter (PAMF) [22], one of the first in
this class, models the disturbance signal as a parametric multichannel autoregressive (AR) process.
The parametric model allows signal whitening through an inverse moving-average filter,
which replaces the standard whitening process using a full-dimensional space-time covariance
matrix estimate found in classical STAP detectors. The immediate benefit brought by the parametric
model is a reduction in the number of unknown parameters to be estimated and, in turn, reduced training and
computational requirements. The multichannel AR process has been found to be an effective
tool to model real-world airborne radar clutter for STAP detection [23,43,44]. It is also versatile
in capturing the temporal and spatial correlation of disturbance signals in other radar and
array processing applications (e.g., [24,25,39]). The PAMF detector is shown to be equivalent
to a parametric Rao detector in [28]. The equivalence leads to analytical expressions for the
asymptotic performance of the PAMF detector. Efficient implementations of the PAMF detector
capitalizing on the inherent computational structure of the multichannel AR model are reported
in [45,46]. Meanwhile, extensions of the multichannel AR modeling to non-stationary cases for
STAP detection are investigated in [47-50].

The parametric generalized likelihood ratio test (GLRT) [28] is a recent addition to the
parametric STAP family. An interesting observation made in [28] is that it is possible to trade
range training with the number of pulses within a coherent processing interval (CPI). Specifically,
the traditional way of learning the clutter statistic is to use the signals received over adjacent
range cells near the test range as the training signals, assuming that the target is a rare event
and that the clutter statistic does not change much over the neighborhood of the cell under test.
This assumption is clearly violated in heterogeneous dense-target environments, which is why
the sample covariance matrix based detectors do not perform well in such cases. In contrast, [28]
shows that the clutter statistic can be extracted from the temporal pulses over a CPI; in the extreme
case, this can be achieved exclusively from the test signal, without using any range training,
provided that the number of pulses is large enough. The performance of the parametric GLRT
has been examined using simulated and real data in various training-limited cases [43,44].

There  are  still  critical  unresolved  issues  with  the  parametric  GLRT.  Specifically,  it  involves 
highly  nonlinear  parameter  estimation  that  has  no  closed-form  solution.  An  iterative  search  based 
procedure  is  employed  in  [28],  which  is  seen  to  be  computationally  intensive.  Moreover,  the 
iterative  searching  requires  an  initial  guess  of  the  unknown  parameter.  A  two-step  estimator  is 
presented  for  that  purpose,  which  starts  with  a  least-squares  (LS)  estimation  step  by  ignoring  the 
temporal/spatial  correlation,  followed  by  a  refining  step.  While  this  estimator  is  an  asymptotic 
maximum  likelihood  (AML)  estimator,  its  performance  is  limited  by  the  coarse  LS  estimator 
and,  as  we  show  in  Section  5.5,  may  not  perform  well  when  the  number  of  pulses  is  small. 
Finally,  the  parametric  GLRT,  due  to  its  complicated  nonlinear  form,  offers  little  insight  into  how 
it  functions.  This  is  different  from  other  parametric  STAP  detectors  (e.g.,  [23,40])  which  have  a 
clear  interpretation  of  sequential  temporal  and  spatial  whitening  (see  discussions  in  Section  5.4 
for  details). 

To address the above issues, we present herein a new estimator for the estimation problem
underlying the parametric GLRT. The new estimator is in closed form and computationally simple.
Unlike the earlier AML estimator, it does not need an initial guess and, thus, is not hindered
by poor initialization. The new estimator also leads to a simplified parametric GLRT, offering
additional  insight  unavailable  with  the  original  GLRT.  In  general,  the  new  GLRT  achieves  similar 
detection  performance  as  the  original  one.  But  in  the  more  challenging  case  when  the  number  of 
pulses  is  limited,  the  new  GLRT  may  outperform  the  original  GLRT  (which  employs  an  iterative 
search  based  estimation  procedure  initialized  by  the  AML  estimator).  The  performance  loss  of 
the  latter  is  primarily  due  to  the  poor  initial  parameter  estimate  provided  by  the  AML  estimator. 

The  remainder  of  this  chapter  is  organized  as  follows.  Section  5.2  contains  the  data  model  and 
a  summary  of  the  original  GLRT  of  [28],  where  an  underlying  nonlinear  amplitude  estimation 
problem  is  also  highlighted.  A  new  amplitude  estimator  is  introduced  in  Section  5.3,  which  leads 
to a simplified parametric GLRT presented in Section 5.4. Numerical results and conclusions are
provided in Sections 5.5 and 5.6, respectively.


5.2  Data  Model  and  Parametric  GLRT 


5.2.1  Data  Model 


The problem of interest is to detect a $JN \times 1$ multichannel signal $\mathbf{s}$ with unknown amplitude $\alpha$
in the presence of spatially and temporally correlated disturbance (e.g., [1]):

\[
H_0:\ \mathbf{x}_0 = \mathbf{d}_0, \qquad
H_1:\ \mathbf{x}_0 = \alpha\mathbf{s} + \mathbf{d}_0, \tag{5.1}
\]

where $J$ denotes the number of spatial channels and $N$ the number of temporal observations (i.e.,
snapshots). It will be convenient to express the $JN \times 1$ vectors in terms of their spatial $J \times 1$
components, i.e.,

\[
\mathbf{s} = \left[\mathbf{s}^T(0),\ \ldots,\ \mathbf{s}^T(N-1)\right]^T, \tag{5.2}
\]

and similarly $\mathbf{d}_0$ and $\mathbf{x}_0$ are decomposed into $\mathbf{d}_0(n)$ and $\mathbf{x}_0(n)$, respectively. In the sequel, $\mathbf{x}_0(n)$
is referred to as the test signal, $\mathbf{s}(n)$ as the steering vector (assumed known to the detector), and
$\mathbf{d}_0(n)$ as the disturbance signal (i.e., clutter and noise) that may be correlated in space and time.
In addition to the test signal $\mathbf{x}_0(n)$, there may be a set of training or secondary signals $\mathbf{x}_k(n)$,
$k = 1, \ldots, K$, that are target-free: $\mathbf{x}_k(n) = \mathbf{d}_k(n)$.

The binary composite hypothesis testing problem is to select between $H_0: \alpha = 0$ and $H_1: \alpha \neq 0$.
A standard assumption in STAP detection (e.g., [1,9-13,15,16]) is that the disturbance
signals $\{\mathbf{d}_k\}_{k=0}^{K}$ are independent and identically distributed (i.i.d.) with distribution $\mathcal{CN}(\mathbf{0}, \mathbf{R})$,
where $\mathbf{R}$ is the unknown space-time covariance matrix. The parametric STAP detectors [22,23,
28,40] further assume that the disturbance signal $\mathbf{d}_k(n)$, $k = 0, \ldots, K$, can be modeled as a
$J$-channel AR($P$) process:

\[
\mathbf{d}_k(n) = -\sum_{i=1}^{P} \mathbf{A}^H(i)\,\mathbf{d}_k(n-i) + \boldsymbol{\varepsilon}_k(n), \tag{5.3}
\]

where $\{\mathbf{A}^H(i)\}_{i=1}^{P}$ denote the unknown $J \times J$ AR coefficient matrices, and $\boldsymbol{\varepsilon}_k(n)$ denote the $J \times 1$
spatial noise vectors that are assumed to be temporally white but spatially colored Gaussian noise:
$\boldsymbol{\varepsilon}_k(n) \sim \mathcal{CN}(\mathbf{0}, \mathbf{Q})$, where $\mathbf{Q}$ denotes the unknown $J \times J$ spatial covariance matrix.
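The recursion (5.3) can be illustrated with a minimal sketch. The function names `matvec` and `ar_filter` are hypothetical, the process is assumed to start from zero initial conditions ($\mathbf{d}_k(n) = \mathbf{0}$ for $n < 0$), and the driving noise is passed in by the caller rather than drawn from $\mathcal{CN}(\mathbf{0}, \mathbf{Q})$, so that the output is reproducible.

```python
def matvec(m, v):
    """Multiply a J x J matrix (list of rows) by a length-J vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def ar_filter(a_h, eps):
    """Run d(n) = -sum_i A^H(i) d(n - i) + eps(n), assuming d(n) = 0 for n < 0.

    a_h: list of P matrices; a_h[i - 1] plays the role of A^H(i) in (5.3).
    eps: list of N driving-noise vectors, each of length J (supplied by caller).
    """
    d = []
    for n, e in enumerate(eps):
        out = list(e)
        for i, m in enumerate(a_h, start=1):
            if n - i >= 0:  # samples before n = 0 are taken as zero
                fed_back = matvec(m, d[n - i])
                out = [o - f for o, f in zip(out, fed_back)]
        d.append(out)
    return d
```

For a single channel with $\mathbf{A}^H(1) = -0.5$ and a unit impulse as driving noise, the output decays geometrically, which is the behavior expected of a stable AR(1) filter.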

5.2.2  Parametric  GLRT 

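To introduce the necessary notation and also to facilitate comparison, we briefly summarize the
parametric GLRT [28] as follows. The parametric GLRT first finds the ML estimates (MLEs) of
the unknown parameters under both hypotheses, which are next used to compute the test statistic.
Amplitude estimation under $H_1$ turns out to be the key problem, as the other parameters can be
readily obtained once an estimate of $\alpha$ is available. Specifically, the MLE of $\alpha$ is given by

\[
\hat{\alpha}_{\mathrm{ML}} = \arg\min_{\alpha} \left| \hat{\mathbf{R}}_{xx}(\alpha) - \hat{\mathbf{R}}_{yx}^H(\alpha)\,\hat{\mathbf{R}}_{yy}^{-1}(\alpha)\,\hat{\mathbf{R}}_{yx}(\alpha) \right|, \tag{5.4}
\]

where $\hat{\mathbf{R}}_{xx}(\alpha)$, $\hat{\mathbf{R}}_{yy}(\alpha)$ and $\hat{\mathbf{R}}_{yx}(\alpha)$ are $J \times J$, $JP \times JP$ and $JP \times J$ matrices defined as

\[
\hat{\mathbf{R}}_{xx}(\alpha) = \sum_{k=1}^{K}\sum_{n=P}^{N-1} \mathbf{x}_k(n)\mathbf{x}_k^H(n)
+ \sum_{n=P}^{N-1} \left[\mathbf{x}_0(n) - \alpha\mathbf{s}(n)\right]\left[\mathbf{x}_0(n) - \alpha\mathbf{s}(n)\right]^H, \tag{5.5}
\]

\[
\hat{\mathbf{R}}_{yy}(\alpha) = \sum_{k=1}^{K}\sum_{n=P}^{N-1} \mathbf{y}_k(n)\mathbf{y}_k^H(n)
+ \sum_{n=P}^{N-1} \left[\mathbf{y}_0(n) - \alpha\mathbf{t}(n)\right]\left[\mathbf{y}_0(n) - \alpha\mathbf{t}(n)\right]^H, \tag{5.6}
\]

\[
\hat{\mathbf{R}}_{yx}(\alpha) = \sum_{k=1}^{K}\sum_{n=P}^{N-1} \mathbf{y}_k(n)\mathbf{x}_k^H(n)
+ \sum_{n=P}^{N-1} \left[\mathbf{y}_0(n) - \alpha\mathbf{t}(n)\right]\left[\mathbf{x}_0(n) - \alpha\mathbf{s}(n)\right]^H, \tag{5.7}
\]

and the regression vectors are defined as $\mathbf{t}(n) = \left[\mathbf{s}^T(n-1),\ \ldots,\ \mathbf{s}^T(n-P)\right]^T \in \mathbb{C}^{JP \times 1}$ and
$\mathbf{y}_k(n) = \left[\mathbf{x}_k^T(n-1),\ \ldots,\ \mathbf{x}_k^T(n-P)\right]^T \in \mathbb{C}^{JP \times 1}$, $k = 0, \ldots, K$. Once $\hat{\alpha}_{\mathrm{ML}}$ is obtained, the
parametric GLRT is given by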


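\[
T_{\mathrm{GLRT}} = 2L \ln \frac{\left|\hat{\mathbf{Q}}_{\mathrm{ML},0}\right|}{\left|\hat{\mathbf{Q}}_{\mathrm{ML},1}\right|} \ \underset{H_0}{\overset{H_1}{\gtrless}}\ \gamma_{\mathrm{GLRT}}, \tag{5.8}
\]

where¹ $L = (K+1)(N-P)$, $\gamma_{\mathrm{GLRT}}$ denotes the corresponding test threshold, and $\hat{\mathbf{Q}}_{\mathrm{ML},0}$ and
$\hat{\mathbf{Q}}_{\mathrm{ML},1}$ denote the ML estimates of the spatial covariance matrix under the null and alternative
hypotheses:

\[
\hat{\mathbf{Q}}_{\mathrm{ML},0} = \hat{\mathbf{Q}}(\alpha)\big|_{\alpha=0}, \tag{5.9}
\]

\[
\hat{\mathbf{Q}}_{\mathrm{ML},1} = \hat{\mathbf{Q}}(\alpha)\big|_{\alpha=\hat{\alpha}_{\mathrm{ML}}}, \tag{5.10}
\]

with

\[
\hat{\mathbf{Q}}(\alpha) = \frac{1}{L}\left( \hat{\mathbf{R}}_{xx}(\alpha) - \hat{\mathbf{R}}_{yx}^H(\alpha)\,\hat{\mathbf{R}}_{yy}^{-1}(\alpha)\,\hat{\mathbf{R}}_{yx}(\alpha) \right). \tag{5.11}
\]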
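As a small illustration of how the test (5.8) is evaluated once the two covariance estimates are available, the sketch below takes their determinants as inputs. The function names are hypothetical, and the determinants are assumed to be real and positive (as they are for Hermitian positive definite estimates); computing them from data is outside the scope of this sketch.

```python
import math


def glrt_statistic(det_q0, det_q1, K, N, P):
    """T_GLRT = 2 L ln(|Q_ML,0| / |Q_ML,1|) with L = (K + 1)(N - P), as in (5.8)."""
    L = (K + 1) * (N - P)
    return 2.0 * L * math.log(det_q0 / det_q1)


def declare_target(statistic, threshold):
    """Return True (decide H1) when the statistic exceeds the threshold."""
    return statistic > threshold
```

The statistic is zero when the two determinants coincide and grows as the fit under $H_1$ reduces the residual determinant relative to $H_0$.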

¹ While the scaling factor $L$ can be dropped from the test statistic, it was retained in [28] to simplify the asymptotic
analysis.

The MLE (5.4) is highly nonlinear and cannot be solved in closed form. Iterative search
over a two-dimensional (2D) parameter space (note that $\alpha$ is complex-valued) is typically employed,
which is computationally intensive (as the matrix determinant has to be evaluated for
every update of $\alpha$) and, in general, converges only to a local minimum. To address this problem,
an asymptotic ML (AML) estimator was introduced in [28]. The AML estimator involves a
two-step process. In particular, it first computes the least-squares (LS) estimate of the amplitude

\[
\hat{\alpha}_{\mathrm{LS}} = \frac{\mathbf{s}^H \mathbf{x}_0}{\mathbf{s}^H \mathbf{s}}, \tag{5.12}
\]

which effectively ignores the spatio-temporal correlation of the disturbance signal. Then, the
initial estimate is refined through a weighted LS process (see [28] for details). Although the AML
estimate can be shown to be asymptotically efficient, it is affected by the limited performance of
the initializing LS estimator, in particular when the data size is small.
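The LS estimate (5.12) is simple enough to sketch directly. The function name `ls_amplitude` is hypothetical, and the space-time vectors are assumed to be given as flat Python sequences of complex numbers of equal length.

```python
def ls_amplitude(s, x0):
    """alpha_LS = (s^H x0) / (s^H s), as in (5.12), for equal-length sequences."""
    num = sum(si.conjugate() * xi for si, xi in zip(s, x0))  # s^H x0
    den = sum(abs(si) ** 2 for si in s)                      # s^H s
    return num / den
```

In the disturbance-free case $\mathbf{x}_0 = \alpha\mathbf{s}$ the estimate returns $\alpha$ exactly; with correlated disturbance it remains usable as an initial guess but, as noted above, it ignores the disturbance structure.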

In closing this section, we briefly comment on the stability issue. In general, the multichannel
AR process used to model the disturbance signal has to be stable to ensure that the resulting AR
signal is wide-sense stationary [30]. A constrained ML estimator that maximizes the likelihood
function under the constraint that the AR coefficient matrices form a stable multichannel filter is
highly involved and generally not suitable for practical applications. In contrast, the estimators
considered in this work, including the ML and AML estimators as well as the one introduced
in the next section, do not impose this constraint, for the sake of computational simplicity. Although
the estimated AR model obtained by any of these estimators is not guaranteed to be stable, extensive
numerical studies using simulated and experimental data show that these unconstrained
estimators yield good estimation and detection performance at acceptable complexity.


5.3  Amplitude  Estimation 


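The exact MLE (5.4) minimizes the determinant of the $\alpha$-dependent matrix

\[
\hat{\mathbf{R}}_{xx}(\alpha) - \hat{\mathbf{R}}_{yx}^H(\alpha)\,\hat{\mathbf{R}}_{yy}^{-1}(\alpha)\,\hat{\mathbf{R}}_{yx}(\alpha),
\]

which is the Schur complement (see, e.g., [51]) of the block matrix $\hat{\mathbf{R}}(\alpha)$:

\[
\hat{\mathbf{R}}(\alpha) = \begin{bmatrix} \hat{\mathbf{R}}_{yy}(\alpha) & \hat{\mathbf{R}}_{yx}(\alpha) \\ \hat{\mathbf{R}}_{yx}^H(\alpha) & \hat{\mathbf{R}}_{xx}(\alpha) \end{bmatrix}. \tag{5.13}
\]

It is well known that the determinant of a block matrix like (5.13) can be expressed in terms of
its Schur complement [51]:

\[
\left|\hat{\mathbf{R}}(\alpha)\right| = \left| \hat{\mathbf{R}}_{xx}(\alpha) - \hat{\mathbf{R}}_{yx}^H(\alpha)\,\hat{\mathbf{R}}_{yy}^{-1}(\alpha)\,\hat{\mathbf{R}}_{yx}(\alpha) \right| \left|\hat{\mathbf{R}}_{yy}(\alpha)\right|.
\]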

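Using the above result, the cost function in (5.4) is equivalent to

\[
\ln\left|\hat{\mathbf{R}}(\alpha)\right| - \ln\left|\hat{\mathbf{R}}_{yy}(\alpha)\right|. \tag{5.14}
\]

By using (5.5)-(5.7), along with new definitions of regression vectors

\[
\mathbf{s}_{P+1}(n) = \left[\mathbf{t}^T(n),\ \mathbf{s}^T(n)\right]^T, \tag{5.15}
\]

\[
\mathbf{x}_{k,P+1}(n) = \left[\mathbf{y}_k^T(n),\ \mathbf{x}_k^T(n)\right]^T, \tag{5.16}
\]

$\hat{\mathbf{R}}(\alpha)$ can be decomposed into an $\alpha$-dependent component and an $\alpha$-independent one:

\[
\hat{\mathbf{R}}(\alpha) = (\mathbf{X}_0 - \alpha\mathbf{S})(\mathbf{X}_0 - \alpha\mathbf{S})^H + \sum_{k=1}^{K} \mathbf{X}_k\mathbf{X}_k^H, \tag{5.17}
\]

where the new steering matrix $\mathbf{S}$ and data matrix $\mathbf{X}_k$ are given as

\[
\mathbf{S} = \left[\mathbf{s}_{P+1}(P),\ \cdots,\ \mathbf{s}_{P+1}(N-1)\right] \in \mathbb{C}^{J(P+1) \times (N-P)}, \tag{5.18}
\]

\[
\mathbf{X}_k = \left[\mathbf{x}_{k,P+1}(P),\ \cdots,\ \mathbf{x}_{k,P+1}(N-1)\right], \quad k = 0, 1, \cdots, K. \tag{5.19}
\]

Similarly, $\hat{\mathbf{R}}_{yy}(\alpha)$ can be decomposed as

\[
\hat{\mathbf{R}}_{yy}(\alpha) = (\mathbf{Y}_0 - \alpha\mathbf{T})(\mathbf{Y}_0 - \alpha\mathbf{T})^H + \sum_{k=1}^{K} \mathbf{Y}_k\mathbf{Y}_k^H, \tag{5.20}
\]

where

\[
\mathbf{T} = \left[\mathbf{t}(P),\ \cdots,\ \mathbf{t}(N-1)\right] \in \mathbb{C}^{JP \times (N-P)}, \tag{5.21}
\]

\[
\mathbf{Y}_k = \left[\mathbf{y}_k(P),\ \cdots,\ \mathbf{y}_k(N-1)\right], \quad k = 0, 1, \cdots, K. \tag{5.22}
\]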
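The stacking in (5.15)-(5.22) is purely a bookkeeping operation, which the following sketch makes concrete. The function names are hypothetical; a signal is assumed to be stored as a list of $N$ spatial snapshots, each a list of length $J$, and each matrix is returned as a list of its columns.

```python
def regression_vector(sig, n, P):
    """Stack sig(n-1), ..., sig(n-P) into one length-J*P list: t(n) or y_k(n)."""
    out = []
    for i in range(1, P + 1):
        out.extend(sig[n - i])
    return out


def augmented_vector(sig, n, P):
    """Regression vector followed by sig(n), as in (5.15) and (5.16)."""
    return regression_vector(sig, n, P) + list(sig[n])


def stack_columns(sig, P, builder):
    """Columns for n = P, ..., N-1, as in (5.18), (5.19), (5.21), (5.22)."""
    return [builder(sig, n, P) for n in range(P, len(sig))]
```

Calling `stack_columns(s, P, augmented_vector)` on the steering snapshots gives the columns of $\mathbf{S}$, while `stack_columns(s, P, regression_vector)` gives those of $\mathbf{T}$; the same two calls on $\mathbf{x}_k(n)$ give $\mathbf{X}_k$ and $\mathbf{Y}_k$. Each result has $N - P$ columns.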


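Using (5.17) and (5.20), an asymptotically equivalent expression for (5.14) is derived in Appendix C.1:

\[
\ln\left[1 + \mathrm{tr}\left\{ (\mathbf{X}_0\mathbf{P}_S - \alpha\mathbf{S})^H\,\hat{\mathbf{R}}_X^{-1}\,(\mathbf{X}_0\mathbf{P}_S - \alpha\mathbf{S}) \right\}\right]
- \ln\left[1 + \mathrm{tr}\left\{ (\mathbf{Y}_0\mathbf{P}_T - \alpha\mathbf{T})^H\,\hat{\mathbf{R}}_Y^{-1}\,(\mathbf{Y}_0\mathbf{P}_T - \alpha\mathbf{T}) \right\}\right], \tag{5.23}
\]

where

\[
\hat{\mathbf{R}}_X = \mathbf{X}_0\mathbf{P}_S^{\perp}\mathbf{X}_0^H + \sum_{k=1}^{K} \mathbf{X}_k\mathbf{X}_k^H, \tag{5.24}
\]

\[
\hat{\mathbf{R}}_Y = \mathbf{Y}_0\mathbf{P}_T^{\perp}\mathbf{Y}_0^H + \sum_{k=1}^{K} \mathbf{Y}_k\mathbf{Y}_k^H, \tag{5.25}
\]

with $\mathbf{P}_S^{\perp}$ denoting the projection matrix onto the orthogonal complement of the range of $\mathbf{S}^H$:

\[
\mathbf{P}_S^{\perp} = \mathbf{I} - \mathbf{P}_S = \mathbf{I} - \mathbf{S}^H\left(\mathbf{S}^H\right)^{\dagger}, \tag{5.26}
\]

where $\left(\mathbf{S}^H\right)^{\dagger}$ denotes the Moore-Penrose pseudo-inverse of $\mathbf{S}^H$, while the other projection matrix
$\mathbf{P}_T^{\perp}$ is similarly defined using the matrix $\mathbf{T}$.

Based on the asymptotic expression (5.23), a closed-form estimate of the amplitude is given
by (see Appendix C.2)

\[
\hat{\alpha} = \frac{\mathrm{tr}\left\{ \mathbf{S}^H\hat{\mathbf{R}}_X^{-1}\mathbf{X}_0 - \mathbf{T}^H\hat{\mathbf{R}}_Y^{-1}\mathbf{Y}_0 \right\}}{\mathrm{tr}\left\{ \mathbf{S}^H\hat{\mathbf{R}}_X^{-1}\mathbf{S} - \mathbf{T}^H\hat{\mathbf{R}}_Y^{-1}\mathbf{T} \right\}}. \tag{5.27}
\]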

Note that (5.27) is also an asymptotic maximum likelihood (AML) estimate, since the underlying
approximations (see Appendices C.1 and C.2) of the likelihood function were made in the
asymptotic sense. For convenience, the AML estimator of [28] is henceforth referred to as
AML1, and the new amplitude estimate (5.27) as AML2. While both estimators are
AML, it should be noted that, unlike AML1, which involves a two-step estimation process initialized
by the LS estimator, AML2 is in closed form, requiring only a one-step calculation. As we
show in Section 5.5, the two estimators perform similarly when the data size is large; however,
in the more challenging case with limited data, AML1 yields notably worse performance due
to the coarse initial estimate provided by the LS estimator.


5.4  New  Parametric  GLRT 


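Given the AML2 amplitude estimate (5.27), the spatial covariance matrix estimates (5.9) and
(5.10) can be obtained in closed form, which also leads to a closed-form expression of the parametric
GLRT test statistic. In particular, we show in Appendix C.3 that the test statistic (5.8) can
be expressed as

\[
\frac{\left| \sum_{n=P}^{N-1} \left[ \mathbf{s}_{P+1}^H(n)\,\hat{\mathbf{R}}_X^{-1}\,\mathbf{x}_{0,P+1}(n) - \mathbf{t}^H(n)\,\hat{\mathbf{R}}_Y^{-1}\,\mathbf{y}_0(n) \right] \right|^2}{\sum_{n=P}^{N-1} \left[ \mathbf{s}_{P+1}^H(n)\,\hat{\mathbf{R}}_X^{-1}\,\mathbf{s}_{P+1}(n) - \mathbf{t}^H(n)\,\hat{\mathbf{R}}_Y^{-1}\,\mathbf{t}(n) \right]} \triangleq T_{\mathrm{GLR}}. \tag{5.28}
\]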


To gain additional insight into the test statistic and the behavior of the general parametric GLRT,
it is shown in Appendix C.4 that the test statistic can be equivalently expressed as

\[
T_{\mathrm{GLR}} = \frac{\left| \sum_{n=P}^{N-1} \mathbf{s}_{P+1}^H(n)\,\mathbf{W}\,\mathbf{x}_{0,P+1}(n) \right|^2}{\sum_{n=P}^{N-1} \mathbf{s}_{P+1}^H(n)\,\mathbf{W}\,\mathbf{s}_{P+1}(n)}, \tag{5.29}
\]

where $\mathbf{s}_{P+1}(n)$ and $\mathbf{x}_{0,P+1}(n)$ are $J(P+1) \times 1$ vectors defined in (5.15) and (5.16), and $\mathbf{W}$ is a
block whitening matrix

\[
\mathbf{W} = \begin{bmatrix} \mathbf{W}_1 & \mathbf{W}_2 \\ \mathbf{W}_2^H & \mathbf{W}_3 \end{bmatrix}, \tag{5.30}
\]

with the individual component matrices given by (C.21)-(C.23) of Appendix C.4.

From (5.29), it is seen that the parametric GLRT performs a partial spatio-temporal whitening
across $J(P+1)$ dimensions (i.e., the size of the regression vectors $\mathbf{x}_{0,P+1}$ formed from the test
signal) using the whitening matrix $\mathbf{W}$. Recall that a fully adaptive STAP detector such as Kelly's
GLRT [13] performs a joint spatio-temporal whitening across all $JN$ dimensions, whereas the
parametric Rao or PAMF detector performs successive (as opposed to joint) whitening, i.e., temporal
whitening followed by spatial whitening [23,40]. Hence, the parametric GLRT is positioned
between the two cases. This allows the parametric GLRT to utilize a parametric model and provide
data efficiency just like the Rao test, while exploiting more degrees of freedom for more
effective interference rejection and detection. This corroborates earlier numerical results [28],
which show that the parametric GLRT in general outperforms the parametric Rao test when the data
available for estimation become very limited.


5.5  Numerical  Examples 

In this section, several simulation results are provided to illustrate the performance of the proposed
estimation and detection techniques. We consider simulated data generated using an AR
model and the KASSPER data [52], which were obtained from a more realistic clutter model. For
the first case, the disturbance signal is generated as a multichannel AR(2) process with AR coefficient
$\mathbf{A}$ and spatial covariance matrix $\mathbf{Q}$; these parameters are set to ensure that the AR process
is stable and $\mathbf{Q}$ is a valid covariance matrix, but are otherwise randomly selected. The signal vector $\mathbf{s}$
corresponds to a uniform equispaced linear array with randomly selected normalized spatial and
Doppler frequencies. The signal-to-interference-plus-noise ratio (SINR) is defined as

\[
\mathrm{SINR} = |\alpha|^2\,\mathbf{s}^H\mathbf{R}^{-1}\mathbf{s}. \tag{5.31}
\]
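The definition (5.31) can be evaluated without forming $\mathbf{R}^{-1}$ explicitly, by solving $\mathbf{R}\mathbf{z} = \mathbf{s}$ and taking $\mathbf{s}^H\mathbf{z}$. The sketch below does this with a small Gaussian-elimination routine; the function names `solve` and `sinr` are hypothetical, and $\mathbf{R}$ is assumed to be a nonsingular matrix given as a list of rows.

```python
def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0j] * n
    for r in range(n - 1, -1, -1):  # back substitution
        acc = m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = acc / m[r][r]
    return z


def sinr(alpha, s, r):
    """SINR = |alpha|^2 s^H R^{-1} s, as in (5.31); the real part is returned."""
    z = solve(r, s)  # z = R^{-1} s
    quad = sum(si.conjugate() * zi for si, zi in zip(s, z))
    return abs(alpha) ** 2 * quad.real
```

For a Hermitian positive definite $\mathbf{R}$ the quadratic form is real and positive, so discarding the imaginary part only removes rounding residue.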


5.5.1  Estimation 

We focus here on the challenging case with zero range training, i.e., $K = 0$, which is of great
interest for applications in heterogeneous environments. Under this setup, we consider two subcases:
(1) $N = 32$, i.e., a moderate value for the number of pulses within a CPI; and (2)
$N = 16$, a more limited scenario. For both cases, we compare the LS estimator (5.12), the
AML1 of [28], the AML2 (5.27), and the ML estimator (5.4) initialized by the AML1 (this
choice is motivated by the fact that AML1 provides the best known initial estimate prior
to the current work).

Fig. 5.1 presents the mean-squared error (MSE) of the amplitude estimate $\hat{\alpha}$ obtained by each
estimator, along with the Cramer-Rao bound (CRB), a lower bound for any unbiased estimator,
versus the SINR when $N = 32$ and $J = 4$. It is seen that, in this case, the AML1 and AML2
amplitude estimates are nearly identical to the ML estimate, while the LS amplitude estimate
shows the worst performance among all estimators.

The results for the case of $N = 16$ are shown in Fig. 5.2. We see that the AML1 estimate is
worse than the AML2 estimate in the current case, whereas the ML estimate is the worst due to
inaccurate initialization and local convergence. This clearly shows the limitation of the iterative
search based ML estimator.

5.5.2  Detection 

Here, we report the detection performance under the same setup as in Figs. 5.1 and 5.2. We
compare the various parametric GLRTs, including GLRT/AML1 (i.e., the GLRT (5.8) with the
AML1 estimator), GLRT/ML (the GLRT (5.8) with the ML estimator), and GLRT/AML2 (the
GLRT (5.28) with the new AML2 estimator). Also included in the comparison are the asymptotic
result provided by the parametric GLRT (see [28]) and the ideal matched filter (MF), which
assumes exact knowledge of $\mathbf{R}$ and, therefore, cannot be used in practice but offers a baseline for
comparison. Here, we set the probability of false alarm $P_f = 0.01$.

Fig.  5.3  shows  the  probability  of  detection  Pd  versus  SINR  for  various  detectors  when  the 
number  of  temporal  samples  N  =  32  and  no  range  training  data  is  available.  It  is  seen  that 
the  GLRT/AML2  slightly  outperforms  the  GLRT/ML  and  GLRT/AML1,  but  overall  they  are 
quite  similar,  and  all  are  within  3  dB  from  the  ideal  MF  detector.  The  limited  sample  case  of 
N  =  16  is  depicted  in  Fig.  5.4,  where  the  GLRT/AML2  achieves  significantly  better  results  than 
the  GLRT/ML  and  GLRT/AML1.  The  poor  performance  of  the  GLRT/ML  is  due  to  the  poor 
amplitude  estimate  which,  as  shown  earlier  in  Fig.  5.2,  is  caused  by  inaccurate  initialization  and 
local  convergence. 

5.5.3  KASSPER  Dataset 

In the above simulation, the disturbance is generated by an AR process which matches the assumed
model of the parametric detectors. To show the detection performance in a more realistic
environment, we use the KASSPER dataset which, first, is not generated from an AR model and,
in addition, contains many challenging real-world effects including heterogeneous terrain, array
errors, and dense ground targets (see [52] for a detailed description of the KASSPER dataset).

Fig.  5.5  shows  the  probability  of  detection  versus  SINR  in  the  training-free  K  =  0  case 
where  the  number  of  spatial  channels  is  J  =  11  and  the  number  of  temporal  samples  is  N  =  32. 
All  parametric  detectors  use  an  AR(1)  process  to  model  and  estimate  the  disturbance.  Results 
show  that  the  new  GLRT/AML2  generally  outperforms  the  GLRT/AML1  and  is  slightly  better 
than  the  GLRT/ML  at  high  SINR. 

The parametric GLRT effectively trades range training for temporal pulses within a CPI, and it
performs well if the number $N$ of the latter is large relative to the number of unknowns to be estimated,
which is determined by $J$ (number of spatial channels) and $P$ (AR model order). In general, for low-order AR models,
the parametric GLRT can provide good detection performance (e.g., within 3 dB of the MF
bound) if $N/J > 5$ [28]. This is not the case for Fig. 5.5, where $N/J \approx 3$ and we see a
performance gap of about 7 dB. To close the gap, we consider the case when the parametric
detectors utilize $K = 1$ range training signal while the other parameters are kept the same.
There are two guard cells between the test cell and the training cell. In practical radar systems, $K$
is usually an even number as training data are often taken from both sides of the test cell. Here,
we choose $K = 1$, corresponding to a more restrictive case. The results are depicted in Fig. 5.6.
It should be noted that in the KASSPER data, clutter across range cells is not i.i.d. [52]. Still,
a small amount of training is useful to the parametric detectors in the current case, all of which yield
improved detection performance less than 3 dB from the MF bound. This is due to the fact that
the effect of clutter variation across a small area (i.e., for small $K$) is negligible. On the other
hand, for a data-demanding non-parametric covariance matrix based STAP detector, $K$ has to be
very large, in which case the effect of range-dependent clutter on such detectors can no longer be
neglected [52].

Figure 5.1: MSEs of amplitude estimate $\hat{\alpha}$ versus SINR when $J = 4$, $N = 32$, and $K = 0$.


5.6  Conclusion 

A new parametric GLRT for multichannel adaptive signal detection has been proposed. The
detector builds on a new closed-form solution for the underlying nonlinear estimation problem.
The new parametric GLRT obviates the need for initial parameter estimation as required by an
earlier scheme, is computationally simpler, and provides generally improved detection performance
when training data are limited. Due to its data efficiency, our new parametric GLRT and
the underlying estimator are particularly useful for detection and estimation in training-limited
environments.

Figure 5.2: MSEs of amplitude estimate $\hat{\alpha}$ versus SINR when $J = 4$, $N = 16$, and $K = 0$.

Figure 5.3: Probability of detection $P_d$ versus SINR when $P_f = 0.01$, $J = 4$, $N = 32$, and
$K = 0$ (plot title: K = 0, J = 4, N = 32, $P_f$ = 0.01, AR(2)).

Figure 5.4: Probability of detection $P_d$ versus SINR when $P_f = 0.01$, $J = 4$, $N = 16$, and
$K = 0$ (plot title: K = 0, J = 4, N = 16, $P_f$ = 0.01, AR(2)).

Figure 5.5: Probability of detection $P_d$ versus SINR for the KASSPER dataset when $P_f = 0.01$,
$J = 11$, $N = 32$, and $K = 0$ (plot title: K = 0, J = 11, N = 32, $P_f$ = 0.01, AR(1)).

Figure 5.6: Probability of detection $P_d$ versus SINR for the KASSPER dataset when $P_f = 0.01$,
$J = 11$, $N = 32$, and $K = 1$ (plot title: K = 1, J = 11, N = 32, $P_f$ = 0.01, AR(1)).

Chapter 6

Recursive  Parametric  Tests  for 
Multichannel  Adaptive  Signal  Detection 

6.1  Introduction 

The parametric Rao and parametric GLRT detectors were developed by assuming that the model
order of the multichannel AR process is known a priori to the detector. In practice, the model order
has to be estimated by some model order selection technique, such as the generalized Akaike
Information Criterion (GAIC), Minimum Description Length (MDL), and others [31]. Since
most of these model order selection techniques require estimates of the unknown parameters for
each possible model order before the best one is identified, a standard non-recursive implementation
of the parametric detectors is computationally intensive.

In this chapter, we consider joint model order selection, parameter estimation, and target
detection for STAP applications. We note that the parameter estimates of a multichannel AR process
for all model orders can be efficiently obtained by recursively solving a set of multichannel
Yule-Walker equations using the multichannel Levinson algorithm [29,30]. The multichannel
Levinson algorithm yields the parameter estimates for a particular model order at every recursion,
following which information criteria such as the GAIC can be conveniently computed. As
such, the estimation of the model order is naturally integrated. We follow the above approach and
develop recursive versions of the parametric Rao and parametric GLRT detectors. The recursive
parametric detectors utilize the Yule-Walker parameter estimates obtained by using the multichannel
Levinson algorithm with the biased autocorrelation function (ACF) estimate [29,30].
Our development of the recursive versions of the parametric detectors integrated with the GAIC
for model order selection is well motivated, since the multichannel Levinson algorithm is computationally
efficient and the model order is not required to be known to the detectors. Numerical
results show that the Yule-Walker parameter estimates are asymptotically equivalent to the ML
estimates originally used in the non-recursive parametric Rao [37] and parametric GLRT [53]
detectors. It is also observed that the recursive parametric detectors perform nearly identically
to the corresponding non-recursive parametric detectors, even though the former assume no
knowledge of the model order while the latter assume the exact model order.

The rest of the chapter is organized as follows. Section 6.2 contains the data model and problem
statement. The non-recursive parametric Rao and parametric GLRT detectors with known
model order are summarized in Section 6.3. Section 6.4 contains our recursive parametric Rao
and parametric GLRT detectors with unknown model order. Numerical results are presented in
Section 6.5, followed by our conclusions in Section 6.6.


6.2  Data  Model  and  Problem  Statement 


Consider the problem of detecting a known multichannel signal with unknown amplitude in the
presence of spatially and temporally colored disturbance (e.g., [1]):

\[
H_0:\ \mathbf{x}_0 = \mathbf{d}_0, \qquad
H_1:\ \mathbf{x}_0 = \alpha\mathbf{s} + \mathbf{d}_0, \tag{6.1}
\]

where all vectors are $JN \times 1$ vectors with $J$ denoting the number of spatial channels and $N$
the number of temporal observations. The test signal $\mathbf{x}_0$ contains a disturbance signal $\mathbf{d}_0$, and
possibly a target signal $\alpha\mathbf{s}$, where $\mathbf{s}$ denotes the target steering vector, which is assumed known,
and $\alpha$ the unknown complex amplitude. In addition to the test signal $\mathbf{x}_0$, there may be a set of
target-free range training or secondary signals $\mathbf{x}_k = \mathbf{d}_k \in \mathbb{C}^{JN \times 1}$, $k = 1, \ldots, K$, that can be
exploited to assist in the target detection process. In this chapter, we consider both cases with
or without training data; in the latter case, we set $K = 0$. The disturbance signals $\{\mathbf{d}_k\}_{k=0}^{K}$ are
assumed to be independent and identically distributed (i.i.d.) with distribution $\mathcal{CN}(\mathbf{0}, \mathbf{R})$, where
$\mathbf{R} \in \mathbb{C}^{JN \times JN}$ is the unknown space-time covariance matrix.

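Let us decompose the $JN \times 1$ space-time vector $\mathbf{x}_k$ into a series of $J \times 1$ spatial vectors
$\mathbf{x}_k(n)$ as follows:

\[
\mathbf{x}_k = \left[\mathbf{x}_k^T(0),\ \ldots,\ \mathbf{x}_k^T(N-1)\right]^T. \tag{6.2}
\]

Let $\mathbf{d}_k$ and $\mathbf{s}$ be similarly decomposed into $\mathbf{d}_k(n) \in \mathbb{C}^{J \times 1}$ and $\mathbf{s}(n) \in \mathbb{C}^{J \times 1}$, respectively. Then,
we can rewrite the hypothesis testing problem using the above spatial vectors indexed by $n$ (time):

\[
\begin{aligned}
H_0:&\ \mathbf{x}_0(n) = \mathbf{d}_0(n), && n = 0, \ldots, N-1, \\
H_1:&\ \mathbf{x}_0(n) = \alpha\mathbf{s}(n) + \mathbf{d}_0(n), && n = 0, \ldots, N-1.
\end{aligned} \tag{6.3}
\]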

Furthermore, we follow a parametric approach as in [22,23,37,53], which models the disturbance
signal $\mathbf{d}_k(n)$ as a $J$-channel AR($P$) process with unknown model order $P$:

\[
\mathbf{d}_k(n) = -\sum_{i=1}^{P} \mathbf{A}^H(i)\,\mathbf{d}_k(n-i) + \boldsymbol{\varepsilon}_k(n), \quad k = 0, 1, \ldots, K, \tag{6.4}
\]

where $\{\mathbf{A}^H(i)\}_{i=1}^{P}$ denote the unknown $J \times J$ AR coefficient matrices and $\boldsymbol{\varepsilon}_k(n)$ the $J \times 1$
spatial noise vectors that are temporally white but spatially colored: $\boldsymbol{\varepsilon}_k(n) \sim \mathcal{CN}(\mathbf{0}, \mathbf{Q})$, where
$\mathbf{Q} \in \mathbb{C}^{J \times J}$ denotes the unknown spatial covariance matrix.

The problem of interest is to develop parametric detectors for the above composite hypothesis
testing problem (6.1) or (6.3), using the test signal $\mathbf{x}_0$ and training signals $\{\mathbf{x}_k\}_{k=1}^{K}$ (if any). We
reiterate that the model order $P$ is assumed unknown to the detector in this chapter, whereas the
original developments of the PAMF, parametric Rao and parametric GLRT detectors all assume
that $P$ is known [22,23,37,53]. A distinctive feature of this work is that we consider computationally
efficient solutions to this joint model order selection, parameter estimation, and target
detection problem.


6.3  Non-recursive Parametric Rao and Parametric GLRT Detectors with Known P

For easy reference and to facilitate our later development of the recursive parametric detectors,
we provide a brief summary of the parametric Rao and GLRT detectors in this section. These
detectors are two different solutions to the problem stated in Section 6.2 when the model order
$P$ is known [37,53]. The parametric Rao detector is computationally simpler, but the parametric
GLRT offers improved performance. Both detectors first find the ML estimates of the unknown
parameters, which are next used to compute the test statistics. The likelihood functions under the
null and alternative hypotheses are parameterized by the signal amplitude $\alpha$, the AR coefficients
$\mathbf{A}^H = \left[\mathbf{A}^H(1),\ \ldots,\ \mathbf{A}^H(P)\right] \in \mathbb{C}^{J \times JP}$, and the spatial covariance matrix $\mathbf{Q}$. Note that under the
null hypothesis we have $\alpha = 0$. Given $\mathbf{A}$, the steering vector and test signal can be temporally
whitened through the following inverse (i.e., moving average) filtering:

\[
\tilde{\mathbf{s}}(n) = \mathbf{s}(n) + \sum_{i=1}^{P} \mathbf{A}^H(i)\,\mathbf{s}(n-i), \tag{6.5}
\]

\[
\tilde{\mathbf{x}}_0(n) = \mathbf{x}_0(n) + \sum_{i=1}^{P} \mathbf{A}^H(i)\,\mathbf{x}_0(n-i). \tag{6.6}
\]

This is an important observation exploited by the parametric Rao and parametric GLRT detectors
that are summarized next.
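The inverse filtering in (6.5) and (6.6) can be sketched as follows. The function names are hypothetical, the AR coefficient matrices are assumed to be known (in the detectors they are replaced by estimates), and outputs are produced only for $n = P, \ldots, N-1$, matching the summation limits used in the test statistics.

```python
def matvec(m, v):
    """Multiply a J x J matrix (list of rows) by a length-J vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def whiten(a_h, sig):
    """v~(n) = v(n) + sum_i A^H(i) v(n - i) for n = P, ..., N-1, as in (6.5)-(6.6).

    a_h: list of P matrices; a_h[i - 1] plays the role of A^H(i).
    sig: list of N snapshots, each a list of length J.
    """
    P = len(a_h)
    out = []
    for n in range(P, len(sig)):
        acc = list(sig[n])
        for i, m in enumerate(a_h, start=1):
            acc = [a + b for a, b in zip(acc, matvec(m, sig[n - i]))]
        out.append(acc)
    return out
```

Applied to a disturbance that obeys (6.4) with the same coefficients, the filter returns the driving noise $\boldsymbol{\varepsilon}_k(n)$, which is temporally white by assumption; this is why the operation is referred to as temporal whitening.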

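The parametric GLRT is given by [53]

\[
T_{\mathrm{GLRT}} = 2L \ln \frac{\left|\hat{\mathbf{Q}}_{\mathrm{ML},0}\right|}{\left|\hat{\mathbf{Q}}_{\mathrm{ML},1}\right|} \ \underset{H_0}{\overset{H_1}{\gtrless}}\ \gamma_{\mathrm{GLRT}}, \tag{6.7}
\]

where $L = (K+1)(N-P)$ and $\gamma_{\mathrm{GLRT}}$ denotes the corresponding test threshold. The ML
estimates of the spatial covariance matrix under the null and alternative hypotheses, $\hat{\mathbf{Q}}_{\mathrm{ML},0}$ and
$\hat{\mathbf{Q}}_{\mathrm{ML},1}$, are given by

\[
\hat{\mathbf{Q}}_{\mathrm{ML},0} = \hat{\mathbf{Q}}(\alpha)\big|_{\alpha=0}, \tag{6.8}
\]

\[
\hat{\mathbf{Q}}_{\mathrm{ML},1} = \hat{\mathbf{Q}}(\alpha)\big|_{\alpha=\hat{\alpha}_{\mathrm{ML}}}. \tag{6.9}
\]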

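The $\alpha$-dependent $\hat{\mathbf{Q}}(\alpha)$ is given by

\[
\hat{\mathbf{Q}}(\alpha) = \frac{1}{L}\left( \hat{\mathbf{R}}_{xx}(\alpha) - \hat{\mathbf{R}}_{yx}^H(\alpha)\,\hat{\mathbf{R}}_{yy}^{-1}(\alpha)\,\hat{\mathbf{R}}_{yx}(\alpha) \right), \tag{6.10}
\]

where the $\alpha$-dependent correlation matrices are

\[
\hat{\mathbf{R}}_{xx}(\alpha) = \sum_{k=1}^{K}\sum_{n=P}^{N-1} \mathbf{x}_k(n)\mathbf{x}_k^H(n)
+ \sum_{n=P}^{N-1} \left[\mathbf{x}_0(n) - \alpha\mathbf{s}(n)\right]\left[\mathbf{x}_0(n) - \alpha\mathbf{s}(n)\right]^H, \tag{6.11}
\]

\[
\hat{\mathbf{R}}_{yy}(\alpha) = \sum_{k=1}^{K}\sum_{n=P}^{N-1} \mathbf{y}_k(n)\mathbf{y}_k^H(n)
+ \sum_{n=P}^{N-1} \left[\mathbf{y}_0(n) - \alpha\mathbf{t}(n)\right]\left[\mathbf{y}_0(n) - \alpha\mathbf{t}(n)\right]^H, \tag{6.12}
\]

\[
\hat{\mathbf{R}}_{yx}(\alpha) = \sum_{k=1}^{K}\sum_{n=P}^{N-1} \mathbf{y}_k(n)\mathbf{x}_k^H(n)
+ \sum_{n=P}^{N-1} \left[\mathbf{y}_0(n) - \alpha\mathbf{t}(n)\right]\left[\mathbf{x}_0(n) - \alpha\mathbf{s}(n)\right]^H, \tag{6.13}
\]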


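with $\mathbf{t}(n)$ and $\mathbf{y}_k(n)$ denoting the regression subvectors formed from the steering vector $\mathbf{s}(n)$ and
the signals $\mathbf{x}_k(n)$ (the test signal for $k = 0$), respectively: $\mathbf{t}(n) = \left[\mathbf{s}^T(n-1),\ \ldots,\ \mathbf{s}^T(n-P)\right]^T \in \mathbb{C}^{JP \times 1}$ and
$\mathbf{y}_k(n) = \left[\mathbf{x}_k^T(n-1),\ \ldots,\ \mathbf{x}_k^T(n-P)\right]^T \in \mathbb{C}^{JP \times 1}$. The ML estimate of $\alpha$ under the alternative hypothesis,
which is used in (6.9), is given by

\[
\hat{\alpha}_{\mathrm{ML}} = \arg\min_{\alpha} \left| \hat{\mathbf{R}}_{xx}(\alpha) - \hat{\mathbf{R}}_{yx}^H(\alpha)\,\hat{\mathbf{R}}_{yy}^{-1}(\alpha)\,\hat{\mathbf{R}}_{yx}(\alpha) \right|. \tag{6.14}
\]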
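The parametric Rao test is given by [37]

\[
T_{\mathrm{Rao}} = \frac{2\left| \sum_{n=P}^{N-1} \hat{\tilde{\mathbf{s}}}^H(n)\,\hat{\mathbf{Q}}_{\mathrm{ML},0}^{-1}\,\hat{\tilde{\mathbf{x}}}_0(n) \right|^2}{\sum_{n=P}^{N-1} \hat{\tilde{\mathbf{s}}}^H(n)\,\hat{\mathbf{Q}}_{\mathrm{ML},0}^{-1}\,\hat{\tilde{\mathbf{s}}}(n)} \ \underset{H_0}{\overset{H_1}{\gtrless}}\ \gamma_{\mathrm{Rao}}, \tag{6.15}
\]
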
where $\gamma_{\mathrm{Rao}}$ denotes the test threshold. The temporally whitened steering vector $\hat{\tilde{\mathbf{s}}}(n)$ and test
signal $\hat{\tilde{\mathbf{x}}}_0(n)$ are obtained by replacing $\mathbf{A}^H$ with the ML estimate under $H_0$,

\[
\hat{\mathbf{A}}^H_{\mathrm{ML},0} = -\hat{\mathbf{R}}_{yx}^H(\alpha)\,\hat{\mathbf{R}}_{yy}^{-1}(\alpha)\big|_{\alpha=0}, \tag{6.16}
\]

in (6.5) and (6.6), respectively.

The Rao test is asymptotically equivalent to the GLRT but may be inferior to the latter when
the data size is small. In addition, the Rao test is obtained based on a low-order Taylor expansion
of the GLRT, an approximation which is only valid for weak signals [7]. As such, the performance
of the parametric Rao detector degrades when the weak signal assumption is violated. The parametric
GLRT was developed as an improved detector to deal with the above issues. However, the
cost function of the ML amplitude estimator in (6.14) is highly nonlinear. Newton-like iterative
nonlinear searches are generally used to find the ML amplitude estimate. Another suboptimum
but computationally more efficient estimator, referred to as the asymptotic ML (AML) estimator,
was developed in [53]. The AML estimator, which was found to yield similar performance to the
ML estimator, can be implemented as follows:

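•  Step 1  First, compute a least-squares (LS) amplitude estimate $\hat{\alpha}_{\mathrm{LS}} = \mathbf{s}^H\mathbf{x}_0 / (\mathbf{s}^H\mathbf{s})$. Then, determine
an estimate $\hat{\mathbf{A}}^H_{\mathrm{LS}}$ of $\mathbf{A}^H$ as follows:

\[
\hat{\mathbf{A}}^H_{\mathrm{LS}} = -\hat{\mathbf{R}}_{yx}^H(\hat{\alpha}_{\mathrm{LS}})\,\hat{\mathbf{R}}_{yy}^{-1}(\hat{\alpha}_{\mathrm{LS}}), \tag{6.17}
\]

which can be shown to be statistically consistent [53].

•  Step 2  Compute the temporally whitened signals $\hat{\tilde{\mathbf{x}}}_k(n)$ and $\hat{\tilde{\mathbf{s}}}(n)$ by replacing $\mathbf{A}^H$ with
the LS AR coefficient estimate in (6.5) and (6.6), respectively. Then, obtain the AML
amplitude estimate $\hat{\alpha}_{\mathrm{AML}}$ by using

\[
\hat{\alpha}_{\mathrm{AML}} = \frac{\mathrm{tr}\left( \hat{\tilde{\mathbf{S}}}^H\,\hat{\boldsymbol{\Psi}}^{-1}\,\hat{\tilde{\mathbf{X}}}_0 \right)}{\mathrm{tr}\left( \hat{\tilde{\mathbf{S}}}^H\,\hat{\boldsymbol{\Psi}}^{-1}\,\hat{\tilde{\mathbf{S}}} \right)}, \tag{6.18}
\]

where $\hat{\tilde{\mathbf{S}}} = \left[\hat{\tilde{\mathbf{s}}}(P),\ \ldots,\ \hat{\tilde{\mathbf{s}}}(N-1)\right] \in \mathbb{C}^{J \times (N-P)}$, $\hat{\tilde{\mathbf{X}}}_k = \left[\hat{\tilde{\mathbf{x}}}_k(P),\ \ldots,\ \hat{\tilde{\mathbf{x}}}_k(N-1)\right] \in \mathbb{C}^{J \times (N-P)}$, and

\[
\hat{\boldsymbol{\Psi}} = \hat{\tilde{\mathbf{X}}}_0\,\mathbf{P}^{\perp}\,\hat{\tilde{\mathbf{X}}}_0^H + \sum_{k=1}^{K} \hat{\tilde{\mathbf{X}}}_k\hat{\tilde{\mathbf{X}}}_k^H, \tag{6.19}
\]

with $\mathbf{P}^{\perp}$ denoting the projection matrix onto the orthogonal complement of the
range of $\hat{\tilde{\mathbf{S}}}^H$: $\mathbf{P}^{\perp} = \mathbf{I} - \mathbf{P} = \mathbf{I} - \hat{\tilde{\mathbf{S}}}^H \big(\hat{\tilde{\mathbf{S}}}^H\big)^{\dagger} \in \mathbb{C}^{(N-P) \times (N-P)}$.

•  Step 3  Find the AML estimate of the spatial covariance matrix by substituting $\hat{\alpha}_{\mathrm{AML}}$ for $\alpha$
in (6.10).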

Recall  that  the  parametric  Rao  and  parametric  GLRT  detectors  utilize  both  the  test  and  train¬ 
ing  signals  for  the  parameter  estimation.  As  a  result,  they  are  functional  even  without  training 
data  [53].  The  capability  to  handle  the  training-free  detection  is  a  unique  and  desirable  attribute 
of  the  parametric  detectors  which  is  not  shared  by  other  existing  detectors  including  the  PAMF 
detector.  Nevertheless,  we  need  a  way  to  efficiently  find  an  accurate  estimate  of  the  model  order 
P. 

6.4  Recursive Parametric Tests with Unknown P


A  standard  non-recursive  implementation  of  the  parametric  detectors  is  computationally  intensive 
since  the  parameter  estimation  for  the  underlying  parametric  model  has  to  be  repeated  for  all 
possible  model  orders  before  the  best  one  is  identified.  Therefore,  there  is  a  need  to  develop 
more  efficient  solutions  for  joint  model  order  selection,  parameter  estimation,  and  detection. 

We present herein recursive versions of the parametric Rao and parametric GLRT detectors.
The multichannel Levinson algorithm is used to recursively solve a set of multichannel Yule-Walker
equations for model order p = 1, 2, ..., Pmax, where Pmax is an upper bound on the
model order P. Interestingly, the above procedure, which provides
parameter estimates for all Pmax model orders, has lower complexity than solving
for a single model order p = Pmax by the ML approach (see Section 6.4.5 for details). Given
these parameter estimates for all possible p, an information criterion such as the GAIC can be
conveniently utilized to identify the best model order as well as the associated estimates of A,
Q, and $\alpha$. These parameter estimates are then used to compute the final test statistics for the
parametric Rao and GLRT detectors. In the following, we discuss the details of the proposed
joint approach.

6.4.1  Parameter  Estimation  by  the  Multichannel  Levinson  Algorithm 

Assume  that  signal  x(n)  is  a  J-channel  AR(P)  process  as  described  in  (6.4).  Estimates  of  the 
unknown  parameters  can  be  obtained  by  solving  the  multichannel  Yule -Walker  equations  given 
by  [29,30] 

$$R(m) = \begin{cases} -\sum_{i=1}^{P} A^H(i) R(m-i), & m \ge 1, \\ -\sum_{i=1}^{P} A^H(i) R(-i) + Q, & m = 0, \end{cases} \tag{6.20}$$

where  the  autocorrelation  matrix  is  defined  as 

$$R(m) = E[x(n) x^H(n-m)]. \tag{6.21}$$

In  matrix  form,  the  multichannel  Yule-Walker  equations  become 

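$$\mathcal{A}_P \mathcal{R} = [Q \;\; 0 \;\; \cdots \;\; 0], \tag{6.22}$$

where the block matrix $\mathcal{A}_P$ contains the multichannel AR coefficients and $\mathcal{R}$ is a block Toeplitz
matrix:

$$\mathcal{A}_P = [I \;\; A^H(1) \;\; \cdots \;\; A^H(P)], \tag{6.23}$$

$$\mathcal{R} = \begin{bmatrix} R(0) & \cdots & R(P) \\ R(-1) & \cdots & R(P-1) \\ \vdots & \ddots & \vdots \\ R(-P) & \cdots & R(0) \end{bmatrix}. \tag{6.24}$$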


The  multichannel  Levinson  algorithm  can  be  used  to  recursively  solve  the  above  multichannel 
Yule -Walker  equations  for  different  model  orders  as  follows  [29, 30]. 

The  multichannel  Levinson  algorithm  begins  with  the  following  initial  conditions: 


$$Q^f_0 = Q^b_0 = R(0), \tag{6.25}$$

$$\mathcal{A}_0 = \mathcal{B}_0 = I. \tag{6.26}$$


Henceforth, superscripts f and b denote the forward and backward directions of a linear prediction
process used by the Levinson algorithm, the subscript denotes the order of the linear predictor,
and $\mathcal{A}$ and $\mathcal{B}$ denote the block row matrices formed by the forward and backward AR coefficient
matrices, respectively.

Given the p-th order forward and backward AR coefficient matrices $\mathcal{A}_p$ and $\mathcal{B}_p$, the forward
and backward reflection coefficient matrices for the (p+1)-st order linear predictors are computed
by

$$K^f_{p+1}(p+1) = -\Delta_{p+1} (Q^b_p)^{-1}, \tag{6.27}$$

$$K^b_{p+1}(p+1) = -\nabla_{p+1} (Q^f_p)^{-1}, \tag{6.28}$$

where $\Delta_{p+1}$ and $\nabla_{p+1}$ are defined as

$$\Delta_{p+1} = \sum_{i=0}^{p} A^H_p(i) R(p+1-i), \tag{6.29}$$

$$\nabla_{p+1} = \sum_{i=0}^{p} B^H_p(i) R(i-p-1). \tag{6.30}$$

Next, we update the forward and backward AR coefficient matrices for the (p+1)-st order
predictors as follows:

$$\mathcal{A}_{p+1} = [\mathcal{A}_p, \; 0] + K^f_{p+1}(p+1) \, [0, \; \mathcal{B}_p], \tag{6.31}$$

$$\mathcal{B}_{p+1} = [0, \; \mathcal{B}_p] + K^b_{p+1}(p+1) \, [\mathcal{A}_p, \; 0]. \tag{6.32}$$

Finally, we update the forward and backward prediction error covariance matrices for the (p+1)-st
order predictors:

$$Q^f_{p+1} = Q^f_p + K^f_{p+1}(p+1) \nabla_{p+1}, \tag{6.33}$$

$$Q^b_{p+1} = Q^b_p + K^b_{p+1}(p+1) \Delta_{p+1}, \tag{6.34}$$

which completes the p-th recursion of the multichannel Levinson algorithm. Note that the solutions
to the p-th order multichannel Yule-Walker equations are $\mathcal{A}_p$ and $Q^f_p$.

In practice, the space-time covariance matrix R(m) in the multichannel Yule-Walker equations
should be replaced by some estimate. The biased ACF estimate given by

$$\hat{R}(m) = \frac{1}{N} \sum_{n=0}^{N-1-m} x(n+m) x^H(n), \tag{6.35}$$

is usually recommended since it guarantees that $\mathcal{R}$ is nonnegative definite [29,30].
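The recursion (6.25) to (6.34), driven by the biased ACF estimate (6.35), can be sketched in NumPy as below. This is a minimal sketch under stated assumptions rather than a reference implementation: the function names are hypothetical, negative lags are filled in with R(-m) = R(m)^H, the forward block row is stored as the list [I, A(1), ..., A(p)] and the backward block row as [B(p), ..., B(1), I].

```python
import numpy as np

def biased_acf(x, p_max):
    """Biased ACF estimate (6.35) for lags m = 0..p_max; x is J x N."""
    N = x.shape[1]
    return [x[:, m:] @ x[:, :N - m].conj().T / N for m in range(p_max + 1)]

def multichannel_levinson(R, p_max):
    """Return [(A_p, Qf_p)] for p = 0..p_max, A_p being a list of J x J blocks."""
    J = R[0].shape[0]

    def acf(m):
        # Assumption: negative lags follow from R(-m) = R(m)^H
        return R[m] if m >= 0 else R[-m].conj().T

    zero = np.zeros((J, J), dtype=complex)
    A = [np.eye(J, dtype=complex)]          # forward block row, (6.26)
    B = [np.eye(J, dtype=complex)]          # backward block row, (6.26)
    Qf = R[0].astype(complex)               # (6.25)
    Qb = R[0].astype(complex)
    out = [(A, Qf)]
    for p in range(p_max):
        delta = sum(A[i] @ acf(p + 1 - i) for i in range(p + 1))   # (6.29)
        nabla = sum(B[j] @ acf(-1 - j) for j in range(p + 1))      # (6.30)
        Kf = -delta @ np.linalg.inv(Qb)                            # (6.27)
        Kb = -nabla @ np.linalg.inv(Qf)                            # (6.28)
        A_ext, B_ext = A + [zero], [zero] + B
        A_new = [a + Kf @ b for a, b in zip(A_ext, B_ext)]         # (6.31)
        B_new = [b + Kb @ a for a, b in zip(A_ext, B_ext)]         # (6.32)
        Qf, Qb = Qf + Kf @ nabla, Qb + Kb @ delta                  # (6.33), (6.34)
        A, B = A_new, B_new
        out.append((A, Qf))
    return out
```

A quick sanity check is to stack the returned blocks into a block row, multiply by the block Toeplitz matrix of (6.24), and confirm that the product has the form [Q 0 ... 0] of (6.22).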

6.4.2  AR Model Order Selection


Model  order  selection  for  parametric  models  is  a  classical  research  topic  and  has  been  investi¬ 
gated  for  various  models  (e.g.,  [30,31],  and  references  therein).  Herein,  we  consider  the  GAIC, 
which  has  been  observed  to  yield  good  performance  for  model  order  selection  (e.g.,  [54]).  The 
GAIC  chooses  the  model  order  p  that  minimizes 

$$W(p) = V(p) + \eta(p), \tag{6.36}$$

where V(p) is the minimum negative log likelihood function and $\eta(p)$ is a penalty term that
penalizes increasing model order [31]. The minimum negative log likelihood function can be
shown to be

$$V(p) = J(K+1)(N-p) \ln(e\pi) + (K+1)(N-p) \ln|\hat{Q}|, \tag{6.37}$$

where the dependence on p is made explicit. The penalty term typically takes the form [31]

$$\eta(p) = 2cJ^2 p \ln(\ln((K+1)(N-p))), \tag{6.38}$$

where c > 1 is a parameter of user choice. It has been found that (6.36) along with (6.38) usually
provides a consistent model order estimate [31].
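For concreteness, (6.36) to (6.38) can be evaluated as in the following sketch. The function name `gaic` and the default c = 2 are illustrative assumptions (the text only requires c > 1), and the log-determinant of the covariance estimate is assumed to be supplied by the caller.

```python
import math

def gaic(p, log_det_q, J, K, N, c=2.0):
    """GAIC W(p) = V(p) + eta(p), following (6.36)-(6.38).

    log_det_q : ln|Q| for the order-p covariance estimate (supplied by caller)
    c         : user-chosen constant, assumed > 1
    """
    L = (K + 1) * (N - p)
    V = J * L * math.log(math.e * math.pi) + L * log_det_q   # (6.37)
    eta = 2.0 * c * J**2 * p * math.log(math.log(L))          # (6.38)
    return V + eta
```

Note that the penalty vanishes at p = 0 and that the double logarithm requires (K + 1)(N - p) > 1.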

6.4.3  Recursive  Parametric  Rao  Test 

Based  on  the  above  recursive  parameter  estimation  and  model  order  selection  techniques,  the 
parametric  Rao  test  can  be  implemented  in  a  recursive  manner  as  follows: 

•  Step 1  Obtain the biased ACF estimate according to (6.35):

$$\hat{R}(m) = \frac{1}{N(K+1)} \sum_{k=0}^{K} \sum_{n=0}^{N-1-m} x_k(n+m) x^H_k(n), \quad m = 0, 1, \ldots, P_{\max}. \tag{6.39}$$

Note  that  both  the  training  and  test  signals  are  used  to  obtain  the  ACF  estimate. 

•  Step 2  Initialization: Set p = 0 and initialize the forward and backward prediction error
covariance matrices, $Q^f_0$ and $Q^b_0$, and the forward and backward AR coefficient matrices,
$\mathcal{A}_0$ and $\mathcal{B}_0$, as in (6.25) and (6.26). Compute the GAIC W(0) for the 0-th model order by
using (6.36).

•  Step 3a  Compute the forward and backward reflection coefficient matrices for the (p+1)-st
order linear predictors, $K^f_{p+1}(p+1)$ and $K^b_{p+1}(p+1)$, by using (6.27) and (6.28).

•  Step 3b  Update the forward and backward AR coefficient matrices for the (p+1)-st order
predictors, $\mathcal{A}_{p+1}$ and $\mathcal{B}_{p+1}$, by using (6.31) and (6.32). Update the forward and backward
prediction error covariance matrices for the (p+1)-st order predictors, $Q^f_{p+1}$ and $Q^b_{p+1}$, by
using (6.33) and (6.34).

•  Step 3c  Compute the GAIC W(p+1) for the (p+1)-st model order, by using (6.36).

-  If p = 0, increase p by 1 and go back to Step 3a;

-  else if W(p+1) > W(p), go to Step 4;

-  otherwise, increase p by 1 and go back to Step 3a.


The following upper bound can be imposed for model order selection [23]:

$$P < \frac{3\sqrt{N}}{J}. \tag{6.40}$$


•  Step 4  The order estimate $\hat{P}$ is p (i.e., the final value of the above recursion index). For the
selected model order $\hat{P} = p$, obtain the parameter estimates:

$$\hat{A}^H(i) = A^H_p(i), \quad i = 1, 2, \ldots, \hat{P}, \tag{6.41}$$

$$\hat{Q} = Q^f_p. \tag{6.42}$$

Compute the parametric Rao test statistic (6.15) by replacing the ML parameter estimates
(6.16) and (6.8) with the obtained Yule-Walker solutions (6.41) and (6.42), respectively.
Finally, the test statistic is compared with a test threshold to decide if the target is present.
The test threshold can be determined by using the asymptotic analysis in [37].
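The stopping rule of Steps 3c and 4 can be sketched as a small loop. This is only an illustrative sketch: `select_order` and the callable `w` (returning the GAIC value W(p)) are hypothetical names, and stopping at `p_max` when no increase in W is observed is an assumption about how the upper bound is enforced.

```python
def select_order(w, p_max):
    """Return the order estimate following the rule in Step 3c.

    w     : callable, w(p) gives the GAIC value W(p)
    p_max : assumed hard upper bound on the model order
    """
    p = 0
    while p < p_max:
        # For p = 0 the recursion always continues; afterwards it stops
        # as soon as the criterion increases from order p to p + 1.
        if p > 0 and w(p + 1) > w(p):
            break
        p += 1
    return p
```

As written in Step 3c, the rule never returns order 0, since the first comparison is made between W(2) and W(1).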


6.4.4  Recursive  Parametric  GLRT 

Recursive implementation of the parametric GLRT is more involved than the recursive parametric
Rao test. The reason is that finding the ML estimate of the signal amplitude $\alpha$, which is required by
the parametric GLRT, is a nonlinear problem even with a known model order [53,55]. To circumvent the
problem, we consider a recursive parametric GLRT that uses the model order estimate obtained
by the recursive parametric Rao test.

The recursive implementation of the parametric GLRT can be summarized as follows:

•  Step 1  Find the spatial covariance matrix estimate $\hat{Q}_0$ under $H_0$ and the model order estimate
$\hat{P}$ by using the multichannel Levinson algorithm in the same manner as in the recursive
parametric Rao test.

•  Step 2  Using the model order estimate $\hat{P}$ obtained in Step 1, find the amplitude estimate
$\hat{\alpha}$ by either (6.14) or (6.18). Next, obtain the spatial covariance matrix estimate $\hat{Q}_1$ by
using $\hat{\alpha}$ and $\hat{P}$. Specifically, the spatial covariance matrix estimate $\hat{Q}_1$ can be obtained by
running the multichannel Levinson algorithm a second time (with $\hat{P}$ recursions) along with
the following modified ACF estimate:

$$\hat{R}(m) = \frac{1}{N(K+1)} \left\{ \sum_{n=0}^{N-1-m} \bar{x}_0(n+m) \bar{x}^H_0(n) + \sum_{k=1}^{K} \sum_{n=0}^{N-1-m} x_k(n+m) x^H_k(n) \right\}, \tag{6.43}$$

where $\bar{x}_0(n) = x_0(n) - \hat{\alpha} s(n)$.

•  Step 3  Compute the test statistic (6.7) by replacing the ML parameter estimates (6.8) and
(6.9) with the Yule-Walker solutions $\hat{Q}_0$ and $\hat{Q}_1$, respectively. Finally, the test statistic is
compared with a test threshold to decide if the target is present.

6.4.5  Complexity 

We provide a brief discussion of the complexity of the recursive parametric Rao test
versus its non-recursive counterpart. Since the recursive and non-recursive implementations differ
only in parameter estimation (they share identical steps in signal whitening and calculating
the test statistic), we only compare the complexity involved in finding estimates of the AR coefficients
A and the spatial covariance matrix Q. Tables 6.1 and 6.2 summarize the number
of flops involved in the major steps of the recursive and non-recursive parameter
estimation, respectively. In general, we have $(K+1)N > J P_{\max}$ for practical applications. Then, it can be
concluded from Tables 6.1 and 6.2 that the recursive parameter estimation, which yields parameter
estimates for all model orders, has an overall complexity of $O(J^2 P_{\max} (K+1) N)$, whereas the
overall complexity of the non-recursive estimation for $p = 1, \ldots, P_{\max}$ is $O(J^2 P^3_{\max} (K+1) N)$, which is $P^2_{\max}$
times higher.

Similar  conclusions  can  be  made  for  the  parametric  GLRT  since,  just  like  the  parametric  Rao 
test,  the  recursive  and  non-recursive  implementations  differ  only  in  how  parameter  estimates  are 
obtained. 
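To make the comparison concrete, the dominant terms of Tables 6.1 and 6.2 can be summed numerically. The sketch below drops all constant factors hidden by the O(.) notation, so the numbers are indicative only; the function names are hypothetical.

```python
def recursive_flops(J, K, N, p_max):
    """Dominant terms of Table 6.1 (constants dropped)."""
    acf = J**2 * p_max * (K + 1) * N                       # (6.39), computed once
    levinson = sum(J**3 * p for p in range(1, p_max + 1))  # per-recursion subtotal
    return acf + levinson

def nonrecursive_flops(J, K, N, p_max):
    """Dominant terms of Table 6.2, summed over p = 1..p_max (constants dropped)."""
    return sum(J**2 * p**2 * (K + 1) * N + J**3 * p**3
               for p in range(1, p_max + 1))
```

For example, with J = 4, K = 2, N = 64, and Pmax = 6, these counts are 19776 and 307776, a ratio of roughly 15.6. The asymptotic factor of $P^2_{\max}$ is approached only up to the constants that this sketch ignores.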


6.5  Numerical  Results 

In  this  section,  we  present  simulation  results  to  illustrate  the  performance  of  the  proposed  tech¬ 
niques.  The  disturbance  signal  is  generated  as  a  multichannel  AR(2)  process  (i.e.,  P  =  2)  with 
randomly  selected  AR  coefficients  A  and  a  spatial  covariance  matrix  Q.  These  parameters  are 
set  to  ensure  that  the  AR  process  is  stable  and  Q  is  a  valid  covariance  matrix,  but  otherwise 
randomly  selected.  The  steering  vector  s  corresponds  to  a  uniform  equi-spaced  linear  array  with 
J = 4 and randomly selected normalized spatial and Doppler frequencies (see [23]). The signal-to-interference-plus-noise
ratio (SINR) is defined as

$$\mathrm{SINR} = |\alpha|^2 s^H R^{-1} s, \tag{6.44}$$

where  the  JN  x  JN  space-time  covariance  matrix  can  be  uniquely  determined  once  AH  and  Q 
are  selected. 

6.5.1  Estimation


We  first  examine  the  estimation  performance  of  the  solutions  to  the  multichannel  Yule -Walker 
equations. Since Q is a matrix, we define the following metric:

$$f(\hat{Q}) = \frac{1}{J^2} \mathrm{tr} \left\{ E \left[ (\hat{Q} - Q)(\hat{Q} - Q)^H \right] \right\}, \tag{6.45}$$

which is the average of the mean squared errors (MSEs) of all elements of the matrix. For brevity,
the above metric is referred to as the MSE henceforth.

Figures 6.1 and 6.2 depict the MSE of the spatial covariance matrix estimate $\hat{Q}$ versus the
number of temporal observations N. We compare the Yule-Walker estimate obtained by using the
multichannel Levinson algorithm with the corresponding ML estimate (6.8). Figure 6.1 shows the
case without training data (K = 0), while Figure 6.2 corresponds to the case with limited training
data (K = 2). It is observed that the Yule-Walker estimate is asymptotically (for large N and/or
large K) equivalent to the ML estimate, while the performance of the Yule-Walker estimate may
be different when the data size is small. Figure 6.1 shows that the Yule-Walker estimate performs
slightly better than the ML estimate when the number of temporal observations N is small
and no training data are available (K = 0). Although it is generally believed that the ML estimate
is more accurate than the Yule-Walker estimate (e.g., [56]), with an extremely small data size as
considered in this example (e.g., N = 10, K = 0), either one of the two estimators can slightly
outperform the other depending on the choice of the AR parameters. It should also be noted
that the bias of the Yule-Walker estimate (because of the use of the biased ACF estimate) can be
significant [56]. Figure 6.2 shows that the Yule-Walker estimate performs nearly identically to
the ML estimate when training data are used (K = 2). It is also observed that the Yule-Walker
and ML estimates improve as the number of training signals (K) and/or temporal observations (N) increases,
i.e., as the data size increases.


6.5.2  Detection 

We  next  examine  the  detection  performance  of  the  recursive  parametric  Rao  and  GLRT  detec¬ 
tors.  For  the  recursive  parametric  GLRT  detector,  the  AML,  instead  of  the  ML,  estimate  of  a 
is  used  since  the  detection  difference  between  the  two  is  negligible  (see  [53])  while  the  AML  is 
computationally  simpler.  Also  included  in  comparison  is  the  ideal  matched  filter  (MF),  which 
assumes  the  exact  knowledge  of  R  and,  therefore,  cannot  be  used  in  practice  but  offers  a  baseline 
for  comparison.  In  all  examples,  we  set  the  probability  of  false  alarm  Pf  =  0.01.  Recursive 
and  Non-Recursive  are  denoted  by  ‘R’  and  ‘NR’,  respectively,  in  the  figures.  For  example,  re¬ 
cursive  and  non-recursive  parametric  GLRT  detectors  are  denoted  by  R-GLRT  and  NR-GLRT, 
respectively. 

Figures 6.3 to 6.6 depict the probability of detection of various detectors versus the SINR
for the recursive parametric detectors with unknown model order P and their non-recursive counterparts
with known P. Figures 6.3 and 6.4 show the case without training data (K = 0), and
Figures 6.5 and 6.6 correspond to the case with limited training data (K = 2 and 8). We see

Equation          Flops                                   Remark
(6.39)            O(J^2 Pmax (K+1) N)                     one-time calculation
(6.27), (6.28)    O(J^3 (p+2))                            at p-th recursion
(6.31), (6.32)    O(J^3 p)                                at p-th recursion
(6.33), (6.34)    O(J^3)                                  at p-th recursion
Subtotal          O(J^3 p)                                at p-th recursion
Total             O(J^2 Pmax (K+1) N) + O(J^3 Pmax^2)     for p = 1, ..., Pmax
                  ≈ O(J^2 Pmax (K+1) N)

Table 6.1: Complexity of the Yule-Walker estimator with the multichannel Levinson algorithm
for model orders p = 1, ..., Pmax (recursive implementation)


Equation    Flops                                   Remark
(6.11)      O(J^2 (K+1)(N-p))                       for model order p
(6.12)      O(J^2 p^2 (K+1)(N-p))                   for model order p
(6.13)      O(J^2 p (K+1)(N-p))                     for model order p
(6.10)      O(J^3 (p^3 + p^2 + p))                  for model order p
Subtotal    O(J^2 p^2 (K+1) N) + O(J^3 p^3)         for model order p
            ≈ O(J^2 p^2 (K+1) N)
Total       O(J^2 Pmax^3 (K+1) N)                   for p = 1, ..., Pmax

Table 6.2: Complexity of the ML estimator for model orders p = 1, ..., Pmax (non-recursive
implementation)


that in general, the performance of the recursive parametric detectors with unknown P is nearly
identical to that of their non-recursive counterparts with known P. This is particularly true for the
cases shown in Figures 6.4 to 6.6, where the data size is relatively large (large N with K = 0, or
a moderate N with non-zero K). This is probably because the Yule-Walker parameter estimate
is slightly more accurate than the ML estimate in this case (see Figure 6.1).


6.6  Conclusions 

We  have  presented  recursive  versions  of  the  parametric  Rao  and  parametric  GLRT  detectors, 
utilizing  the  multichannel  Levinson  algorithm  to  solve  the  multichannel  Yule-Walker  equations 
recursively  and  find  the  estimates  of  the  unknown  parameters,  along  with  a  GAIC  for  model  order 
selection.  Numerical  results  show  that  the  Yule-Walker  estimate  obtained  by  using  the  multichan¬ 
nel  Levinson  algorithm  along  with  the  biased  ACF  estimate  is  asymptotically  equivalent  to  the 
ML  estimate  originally  used  in  the  non-recursive  parametric  Rao  and  parametric  GLRT  detectors. 
It is also shown that the proposed recursive parametric detectors, which assume no knowledge about
the model order, perform nearly identically to the corresponding non-recursive parametric detectors
with perfect knowledge of the model order, while the former have reduced computational
complexity.

Figure 6.1: MSE of spatial covariance matrix estimate versus the number of temporal observations
N when K = 0 and J = 4.


Figure 6.2: MSE of spatial covariance matrix estimate versus the number of temporal observations
N when K = 2 and J = 4.

[Plot: panel title "K=0, J=4, N=64, Pf=1.000000e-002, AR(2)"; horizontal axis: SINR (dB)]

Figure 6.3: Probability of detection Pd versus SINR when K = 0, J = 4, N = 64, P = 2, and
Pf = 0.01.


[Plot: panel title "K=0, J=4, N=128, Pf=1.000000e-002, AR(2)"]

Figure 6.4: Probability of detection Pd versus SINR when K = 0, J = 4, N = 128, P = 2, and
Pf = 0.01.

[Plot: panel title "K=2, J=4, N=64, Pf=1.000000e-002, AR(2)"]

Figure 6.5: Probability of detection Pd versus SINR when K = 2, J = 4, N = 64, P = 2, and
Pf = 0.01.


[Plot: panel title "K=8, J=4, N=32, Pf=1.000000e-002, AR(2)"]

Figure 6.6: Probability of detection Pd versus SINR when K = 8, J = 4, N = 32, P = 2, and
Pf = 0.01.

Chapter 7

Performance  Evaluation  of  Parametric 
Space-Time  Adaptive  Detectors 

7.1  Introduction 

Parametric  model  based  STAP  detectors  have  attracted  significant  interest  in  addressing  the  prob¬ 
lem  of  limited  training  [22-24,28,40].  Specifically,  the  parametric  adaptive  matched  filter 
(PAMF)  [23],  which  was  developed  by  exploiting  a  multichannel  AR  model  for  the  disturbance, 
significantly  outperforms  the  aforementioned  conventional  STAP  detectors  at  a  reduced  complex¬ 
ity  when  the  training  size  is  small.  Recently,  the  PAMF  detector  has  been  shown  to  be  equivalent 
to  a  parametric  Rao  detector  with  one  exception:  while  the  original  PAMF  uses  only  training 
signals  for  parameter  estimation,  the  parametric  Rao  detector  uses  both  training  and  test  signals 
for  estimation  [40].  Moreover,  another  parametric  detector,  referred  to  as  the  parametric  GLRT, 
offers  improved  detection  performance  over  the  parametric  Rao  test  when  data  is  limited  [28]. 
Computer  simulations  show  that  the  parametric  Rao  and  GLRT  detectors  work  well  with  limited 
or  even  no  range  training  data.  The  parametric  Rao  and  GLRT  detectors  utilize  both  test  and 
training  data  for  parameter  estimation;  in  the  absence  of  training,  the  parameters  are  estimated 
solely  from  the  test  signal  [28].  The  capability  to  handle  the  training-free  detection  problem  is  a 
unique  and  desirable  attribute  of  these  parametric  STAP  detectors,  making  them  good  candidates 
for  detection  in  heterogeneous  environments. 

To  facilitate  the  development  of  the  parametric  Rao  and  GLRT  detectors,  two  assumptions 
were  made,  which  may  not  hold  exactly  in  real  airborne  radar  systems  [19,21].  The  first  assump¬ 
tion  is  that  the  disturbance  follows  a  multichannel  AR  process.  The  second  assumption  is  that  the 
training  signals,  if  available,  are  assumed  to  be  independent  and  identically  distributed  (i.i.d.). 
In  practice,  the  training  data  is  subject  to  contamination  by  clutter  discretes  and/or  interfering 
targets,  in  which  case  the  training  data  becomes  heterogeneous.  It  is  noted  that  the  performance 
of  the  parametric  Rao  and  GLRT  detectors  was  evaluated  through  computer  simulations  [28,40] 
where the disturbance signals were generated to meet the aforementioned assumptions. Therefore,
it is of interest to determine how these parametric detectors perform in real radar
environments and to gain insight into their applicability to airborne radars.

In this chapter, we examine the detection performance of the parametric Rao and GLRT
detectors  using  three  more  realistic  datasets:  1)  the  Knowledge-Aided  Sensor  Signal  Processing 
and  Expert  Reasoning  (KASSPER)  dataset  that  contains  heterogeneous  data,  array  errors,  and 
dense  ground  targets;  2)  the  Multi-Channel  Airborne  Radar  Measurement  (MCARM)  dataset 
that  was  acquired  from  realistic  airborne  radar  experiments;  3)  the  dataset  for  a  bistatic  radar 
geometry  containing  range-dependent  clutter.  Experimental  results  show  that  the  parametric  Rao 
and  GLRT  tests  can  provide  good  detection  performance  with  limited  or  even  no  range  training 
in  more  challenging  environments.  Therefore,  these  detectors  offer  useful  solutions  to  detection 
problems  in  dense-target  or  heterogeneous  environments. 


7.2  Data  Model  and  Problem  Statement 

Consider the problem of detecting a known multichannel signal with unknown amplitude in the
presence of spatially and temporally correlated disturbance (e.g., [1]):

$$\begin{aligned} H_0 &: \; x_0(n) = d_0(n), & n = 0, \ldots, N-1, \\ H_1 &: \; x_0(n) = \alpha s(n) + d_0(n), & n = 0, \ldots, N-1, \end{aligned} \tag{7.1}$$

where all vectors are J x 1 vectors, J denotes the number of spatial channels, and N is the number
of temporal observations. In the sequel, $x_0(n)$ is referred to as the test signal, s(n) is the signal
to be detected with amplitude $\alpha$, and $d_0(n)$ is the disturbance signal that may be correlated in
space and time. In addition to the test signal $x_0(n)$, there may be a set of training or secondary
signals $x_k(n)$, k = 1, ..., K, that are target-free: $x_k(n) = d_k(n)$.

In particular, we consider herein the signal detection problem in an airborne STAP radar
system with J array channels and a coherent processing interval (CPI) of N pulse repetition
intervals (PRIs). The disturbance $d_k(n)$, k = 0, ..., K, consists of ground clutter, jamming,
and thermal noise. Let $s = [s^T(0), \ldots, s^T(N-1)]^T$. Similarly, $d_k$ and $x_k$ are formed from
$d_k(n)$ and $x_k(n)$, respectively. The target space-time steering vector s is given by (assuming a
uniform equi-distant linear array):

$$s(\omega_s, \omega_d) = s_s(\omega_s) \otimes s_t(\omega_d), \tag{7.2}$$

where $s_s(\omega_s)$ and $s_t(\omega_d)$ denote the J x 1 spatial steering vector and the N x 1 temporal steering
vector, respectively, and $\omega_s$ and $\omega_d$ denote the normalized target spatial and Doppler frequencies,
respectively.
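A small NumPy sketch of (7.2) is given below. The exponential form of the individual steering vectors (unit-modulus entries exp(j 2 pi omega n), no normalization) is an assumption, since the text does not spell it out, and the function names are hypothetical.

```python
import numpy as np

def steering(n_elem, freq):
    """Assumed form of a steering vector: exp(j*2*pi*freq*n), n = 0..n_elem-1."""
    return np.exp(2j * np.pi * freq * np.arange(n_elem))

def space_time_steering(J, N, w_s, w_d):
    """Kronecker product s_s(w_s) (x) s_t(w_d), in the order written in (7.2)."""
    return np.kron(steering(J, w_s), steering(N, w_d))

# With this Kronecker ordering, reshaping to J x N places the spatial
# snapshot for pulse n in column n.
S = space_time_steering(4, 8, 0.1, 0.25).reshape(4, 8)
```

The reshape step is what connects the stacked vector to the per-pulse snapshots s(n) used by the whitening filters.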

The binary composite hypothesis testing problem is to select between $H_0: \alpha = 0$ and $H_1: \alpha \ne 0$.
Classical STAP detectors (e.g., the RMB, Kelly's GLRT, and AMF detectors) were
developed based on the assumption that [1,9-13,15,16,22,23]:

•  AS1: The disturbance signals $\{d_k\}_{k=0}^{K}$ are i.i.d. with distribution $\mathcal{CN}(0, R)$, where R is
the unknown space-time covariance matrix.

In  addition  to  the  above  general  assumption,  a  parametric  model  can  be  applied  to  characterize 
the  disturbance  [22,23]: 

•  AS2: The disturbance signal $d_k(n)$, k = 0, ..., K, can be modeled as a J-channel AR(P)
process with known model order P:

$$d_k(n) = -\sum_{i=1}^{P} A^H(i) d_k(n-i) + \varepsilon_k(n), \tag{7.3}$$

where $\{A^H(i)\}_{i=1}^{P}$ denote the unknown J x J AR coefficient matrices, and $\varepsilon_k(n)$ denote the
J x 1 spatial noise vectors that are temporally white but spatially colored Gaussian noise:
$\varepsilon_k(n) \sim \mathcal{CN}(0, Q)$, where Q denotes the unknown J x J spatial covariance matrix.
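The AR recursion (7.3) can be simulated directly, as in the following sketch. The function name, the burn-in length used to discard the start-up transient, and the Cholesky-based coloring of the driving noise are illustrative assumptions; the coefficient matrices are assumed to define a stable process.

```python
import numpy as np

def simulate_ar(A_H, Q, N, rng, burn_in=200):
    """Generate N samples of a J-channel AR(P) process following (7.3).

    A_H : list of P matrices, A_H[i-1] playing the role of A^H(i) (assumed stable)
    Q   : J x J positive definite spatial covariance of the driving noise
    """
    J, P = Q.shape[0], len(A_H)
    Lc = np.linalg.cholesky(Q)
    total = N + burn_in
    d = np.zeros((J, P + total), dtype=complex)
    for n in range(P, P + total):
        # Circular complex Gaussian noise with covariance Q
        w = (rng.standard_normal(J) + 1j * rng.standard_normal(J)) / np.sqrt(2)
        d[:, n] = Lc @ w - sum(A_H[i] @ d[:, n - 1 - i] for i in range(P))
    return d[:, -N:]   # drop zero initial state and burn-in samples
```

A typical call would be `simulate_ar([0.5 * np.eye(2)], np.eye(2), 64, np.random.default_rng(0))`, which returns a 2 x 64 array.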

It  is  noted  that  assumption  AS1  is  often  violated  in  real  airborne  radar  systems  including 
dense-target  and  heterogeneous  environments,  which  offer  limited  or  even  no  range  training 
data  [19,21].  It  is  also  shown  that  the  performance  of  STAP  detectors  often  degrades  significantly 
in  a  heterogeneous  environment  where  assumption  AS1  is  violated  because  of  the  mismatch  of  the 
space-time  covariance  matrix  relative  to  that  of  the  target  test  cell.  Moreover,  the  disturbances  in 
real  radar  systems  do  not  follow  the  multichannel  AR  model  in  AS2,  whereas  the  parametric  Rao 
and  GLRT  detectors  assume  that  the  disturbance  follows  the  multichannel  AR  process.  Therefore, 
the  problem  of  interest  is  to  evaluate  the  performance  of  the  parametric  Rao  and  GLRT  detectors 
using  more  realistic  data,  i.e.,  the  KASSPER,  MCARM  and  Bistatic  datasets. 


7.3  Parametric  Rao  and  GLRT  Detectors 

For  easy  reference  and  to  facilitate  our  evaluation  of  the  parametric  detectors,  we  provide  a  brief 
summary  of  the  parametric  Rao  and  GLRT  detectors  in  this  section.  These  detectors  are  two  dif¬ 
ferent  solutions  to  the  problem  stated  in  Section  7.2  [28,37,40,53].  The  parametric  Rao  detector 
is  computationally  simpler,  but  the  parametric  GLRT  offers  improved  performance.  Both  detec¬ 
tors  first  find  the  ML  estimates  of  the  unknown  parameters,  which  are  next  used  to  compute  the 
test statistics. The likelihood functions under the null and alternative hypotheses are parameterized
by the signal amplitude $\alpha$, the AR coefficients $A^H = [A^H(1), \ldots, A^H(P)] \in \mathbb{C}^{J \times JP}$,
and the spatial covariance matrix Q. Note that under the null hypothesis we have $\alpha = 0$. Given A,
the steering vector and test signal can be temporally whitened through the following inverse (i.e.,
moving average) filtering:

$$\tilde{s}(n) = s(n) + \sum_{i=1}^{P} A^H(i) s(n-i), \tag{7.4}$$

$$\tilde{x}_0(n) = x_0(n) + \sum_{i=1}^{P} A^H(i) x_0(n-i). \tag{7.5}$$

This is an important observation exploited by the parametric Rao and parametric GLRT detectors
that are summarized next.

The parametric GLRT is given by [28,53]

$$T_{GLRT} = 2L \ln \frac{|\hat{Q}_{ML,0}|}{|\hat{Q}_{ML,1}|} \underset{H_0}{\overset{H_1}{\gtrless}} \gamma_{GLRT}, \tag{7.6}$$

where L = (K + 1)(N - P) and $\gamma_{GLRT}$ denotes the corresponding test threshold. The ML
estimates of the spatial covariance matrix under the null and alternative hypotheses, $\hat{Q}_{ML,0}$ and
$\hat{Q}_{ML,1}$, are given by

$$\hat{Q}_{ML,0} = \hat{Q}(\alpha)|_{\alpha=0}, \tag{7.7}$$

$$\hat{Q}_{ML,1} = \hat{Q}(\alpha)|_{\alpha=\hat{\alpha}_{ML}}. \tag{7.8}$$


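The $\alpha$-dependent $\hat{Q}(\alpha)$ is given by

$$\hat{Q}(\alpha) = \frac{1}{L} \left( \hat{R}_{xx}(\alpha) - \hat{R}^H_{yx}(\alpha) \hat{R}^{-1}_{yy}(\alpha) \hat{R}_{yx}(\alpha) \right), \tag{7.9}$$

where the $\alpha$-dependent correlation matrices are

$$\hat{R}_{xx}(\alpha) = \sum_{k=1}^{K} \sum_{n=P}^{N-1} x_k(n) x^H_k(n) + \sum_{n=P}^{N-1} [x_0(n) - \alpha s(n)] [x_0(n) - \alpha s(n)]^H, \tag{7.10}$$

$$\hat{R}_{yy}(\alpha) = \sum_{k=1}^{K} \sum_{n=P}^{N-1} y_k(n) y^H_k(n) + \sum_{n=P}^{N-1} [y_0(n) - \alpha t(n)] [y_0(n) - \alpha t(n)]^H, \tag{7.11}$$

$$\hat{R}_{yx}(\alpha) = \sum_{k=1}^{K} \sum_{n=P}^{N-1} y_k(n) x^H_k(n) + \sum_{n=P}^{N-1} [y_0(n) - \alpha t(n)] [x_0(n) - \alpha s(n)]^H, \tag{7.12}$$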


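with t(n) and $y_k(n)$ denoting the regression subvectors formed from the steering vector s(n) and
test signal $x_k(n)$, respectively: $t(n) = [s^T(n-1), \ldots, s^T(n-P)]^T \in \mathbb{C}^{JP \times 1}$ and
$y_k(n) = [x^T_k(n-1), \ldots, x^T_k(n-P)]^T \in \mathbb{C}^{JP \times 1}$, k = 0, ..., K. The ML estimate of $\alpha$ under the
alternative hypothesis, which is used in (7.8), is given by

$$\hat{\alpha}_{ML} = \arg\min_{\alpha} \left| \hat{R}_{xx}(\alpha) - \hat{R}^H_{yx}(\alpha) \hat{R}^{-1}_{yy}(\alpha) \hat{R}_{yx}(\alpha) \right|. \tag{7.13}$$

The parametric Rao test is given by [37,40]

$$T_{Rao} = \frac{2 \left| \sum_{n=P}^{N-1} \hat{\tilde{s}}^H(n) \hat{Q}^{-1}_{ML,0} \hat{\tilde{x}}_0(n) \right|^2}{\sum_{n=P}^{N-1} \hat{\tilde{s}}^H(n) \hat{Q}^{-1}_{ML,0} \hat{\tilde{s}}(n)} \underset{H_0}{\overset{H_1}{\gtrless}} \gamma_{Rao}, \tag{7.14}$$

where $\gamma_{Rao}$ denotes the test threshold. The temporally whitened steering vector $\hat{\tilde{s}}(n)$ and test
signal $\hat{\tilde{x}}_0(n)$ are obtained by replacing $A^H$ with its ML estimate under $H_0$

$$\hat{A}^H_{ML,0} = -\hat{R}^H_{yx}(\alpha) \hat{R}^{-1}_{yy}(\alpha)|_{\alpha=0}, \tag{7.15}$$

in (7.4) and (7.5), respectively.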
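The whitening filters (7.4) and (7.5) amount to a short moving-average operation on the columns of the data matrix. The sketch below is an illustration only: the function name is hypothetical, and the coefficient matrices are assumed to be supplied (for example, from an estimate such as (7.15)).

```python
import numpy as np

def temporally_whiten(x, A_H):
    """Apply x(n) + sum_i A^H(i) x(n-i) for n = P..N-1, as in (7.4)-(7.5).

    x   : J x N signal matrix (columns are the snapshots x(0), ..., x(N-1))
    A_H : list of P matrices, A_H[i-1] playing the role of A^H(i)
    Returns a J x (N - P) matrix of whitened snapshots.
    """
    P, N = len(A_H), x.shape[1]
    out = x[:, P:].astype(complex)
    for i in range(1, P + 1):
        # Columns P-i .. N-1-i hold x(n - i) for n = P .. N-1
        out = out + A_H[i - 1] @ x[:, P - i:N - i]
    return out
```

Applied to a signal generated by the AR recursion (7.3) with the true coefficients, this filter returns the driving noise for n >= P, which is the sense in which the output is temporally white.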

The flowcharts for the parametric GLRT and Rao detectors are shown in Figs. 7.1 and 7.2.
The  Rao  test  is  shown  to  be  asymptotically  equivalent  to  the  GLRT  but  may  be  inferior  to  the 
latter  when  the  data  size  is  small.  In  addition,  the  Rao  test  is  obtained  based  upon  a  low-order 
Taylor  expansion  of  the  GLRT,  an  approximation  which  is  only  valid  for  weak  signals  [7].  As 
such,  the  performance  of  the  parametric  Rao  detector  degrades  when  the  weak  signal  assumption 
is  violated.  The  parametric  GLRT  was  developed  as  an  improved  detector  to  deal  with  the  above 
issues.  However,  the  cost  function  of  the  ML  amplitude  estimator  in  (7.13)  is  highly  non-linear. 
Newton-like  iterative  non-linear  searches  are  generally  used  to  find  the  ML  amplitude  estimate. 
Another  sub-optimum  but  computationally  more  efficient  estimator,  referred  to  as  the  asymptotic 
ML  (AML)  estimator,  was  developed  in  [28,53].  The  AML  estimator,  which  was  found  to  yield 
similar  performance  to  the  ML  estimator,  can  be  implemented  as  follows: 

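•  Step 1  First, compute a least-squares (LS) amplitude estimate $\hat{\alpha}_{LS} = \mathbf{s}^H\mathbf{x}_0 / (\mathbf{s}^H\mathbf{s})$. Then, determine an estimate $\hat{\mathbf{A}}_{LS}^H$ of $\mathbf{A}^H$ as follows:

$$\hat{\mathbf{A}}_{LS}^H = -\hat{\mathbf{R}}_{yx}^H(\hat{\alpha}_{LS})\hat{\mathbf{R}}_{yy}^{-1}(\hat{\alpha}_{LS}), \tag{7.16}$$

which can be shown to be statistically consistent [28,53].

•  Step 2  Compute the temporally whitened signals $\hat{\tilde{\mathbf{x}}}_k(n)$ and $\hat{\tilde{\mathbf{s}}}(n)$ by replacing $\mathbf{A}^H$ with the LS AR coefficient estimate in (7.4) and (7.5), respectively. Then, obtain the AML amplitude estimate $\hat{\alpha}_{AML}$ by using

$$\hat{\alpha}_{AML} = \frac{\operatorname{tr}\left(\hat{\tilde{\mathbf{S}}}^H \hat{\boldsymbol{\Psi}}^{-1} \hat{\tilde{\mathbf{X}}}_0\right)}{\operatorname{tr}\left(\hat{\tilde{\mathbf{S}}}^H \hat{\boldsymbol{\Psi}}^{-1} \hat{\tilde{\mathbf{S}}}\right)}, \tag{7.17}$$

where $\hat{\tilde{\mathbf{S}}} = [\hat{\tilde{\mathbf{s}}}(P), \ldots, \hat{\tilde{\mathbf{s}}}(N-1)] \in \mathbb{C}^{J \times (N-P)}$, $\hat{\tilde{\mathbf{X}}}_k = [\hat{\tilde{\mathbf{x}}}_k(P), \ldots, \hat{\tilde{\mathbf{x}}}_k(N-1)] \in \mathbb{C}^{J \times (N-P)}$, and

$$\hat{\boldsymbol{\Psi}} = \hat{\tilde{\mathbf{X}}}_0 \mathbf{P}^{\perp} \hat{\tilde{\mathbf{X}}}_0^H + \sum_{k=1}^{K} \hat{\tilde{\mathbf{X}}}_k \hat{\tilde{\mathbf{X}}}_k^H, \tag{7.18}$$

with $\mathbf{P}^{\perp}$ denoting the projection matrix projecting to the orthogonal complement of the range of $\hat{\tilde{\mathbf{S}}}^H$: $\mathbf{P}^{\perp} = \mathbf{I} - \mathbf{P} = \mathbf{I} - \hat{\tilde{\mathbf{S}}}^H \left(\hat{\tilde{\mathbf{S}}}\hat{\tilde{\mathbf{S}}}^H\right)^{-1} \hat{\tilde{\mathbf{S}}} \in \mathbb{C}^{(N-P) \times (N-P)}$.

•  Step 3  Find the AML estimate of the spatial covariance matrix by substituting $\hat{\alpha}_{AML}$ for $\alpha$ in (7.9).

Recall that the parametric Rao and parametric GLRT detectors utilize both the test and training signals for parameter estimation. As a result, they are functional even without range training data [28,53]. The capability to handle training-free detection is a unique and desirable attribute of the parametric detectors which is not shared by other existing detectors, including the PAMF detector.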
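The amplitude computations in Steps 1 and 2 can be sketched as below. This is an illustrative sketch under stated assumptions: the whitened matrices are taken as given (the whitening of (7.4) and (7.5) is not reproduced), the function names `ls_amplitude` and `aml_amplitude` are hypothetical, and the toy data are noise-free so that both estimates should recover the true amplitude.

```python
import numpy as np


def ls_amplitude(s, x0):
    """Step 1: least-squares amplitude s^H x0 / (s^H s) for JN x 1 vectors."""
    return np.vdot(s, x0) / np.vdot(s, s)


def aml_amplitude(S, X0, X_train):
    """Step 2: AML amplitude from whitened J x (N-P) matrices, per (7.17)-(7.18)."""
    n_cols = S.shape[1]
    # Projector onto the orthogonal complement of the range of S^H.
    P_perp = np.eye(n_cols) - S.conj().T @ np.linalg.solve(S @ S.conj().T, S)
    Psi = X0 @ P_perp @ X0.conj().T
    for Xk in X_train:
        Psi = Psi + Xk @ Xk.conj().T
    num = np.trace(S.conj().T @ np.linalg.solve(Psi, X0))
    den = np.trace(S.conj().T @ np.linalg.solve(Psi, S))
    return num / den


def complex_normal(rng, shape):
    """Helper for the toy data: circular complex Gaussian samples."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


rng = np.random.default_rng(1)
alpha_true = 0.7 - 0.3j

# Step 1 on a toy JN x 1 steering vector and a noise-free test signal.
s = complex_normal(rng, 16)
alpha_ls = ls_amplitude(s, alpha_true * s)

# Step 2 on toy whitened matrices: J = 2 channels, N - P = 8 columns, K = 2.
S = complex_normal(rng, (2, 8))
X0 = alpha_true * S
X_train = [complex_normal(rng, (2, 8)) for _ in range(2)]
alpha_aml = aml_amplitude(S, X0, X_train)
```

With noise-free test data the projected term in (7.18) vanishes, so $\hat{\boldsymbol{\Psi}}$ is built from the training signals alone. With $K = 0$ the matrix would rely entirely on the projected test-signal term.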


7.4  Performance  Evaluation 

This section tests the parametric detectors using the KASSPER [52], MCARM, and bistatic datasets. Evaluating the parametric detectors on these datasets allows us to assess the influence of the mismatch between AS2 and real disturbances.

7.4.1  Simulated  KASSPER  Dataset 

The KASSPER 2002 dataset contains many real-world effects including heterogeneous terrain, subspace leakage, array errors, and many ground targets. The simulated airborne radar flew at an altitude of 3000 m at 100 m/s, traveling due east with a 3° crab angle. The radar operated at 1240 MHz with a peak power of 15 kW. The 11 (virtual) antenna array elements were spaced slightly less than a half-wavelength apart at 0.1092 m (0.9028 half-wavelength spacing), and the transmit array was uniformly weighted in the horizontal dimension and phased to steer the mainbeam to 195°. The pulse repetition frequency (PRF) was 1984 Hz and the CPI contained 32 pulses.

Dense  Targets  Environment 

The KASSPER dataset simulates a dense-target environment. Of particular interest are the targets in the mainbeam of the radar within the range swath of interest. In total, there are 268 targets from 35 to 50 km.

We apply the parametric detectors to the KASSPER data of interest and compute the test statistics with respect to Doppler frequency and range cells. To compare with other techniques, we count the number of detections while setting the threshold under a constraint on the number of false alarms. Due to range/Doppler sidelobes resulting from pulse compression and Doppler filtering, it is common for a target to spread into nearby range-Doppler cells. For this reason, it is standard procedure for a radar to cluster target detections such that a detection in a given range-Doppler cell is associated with a target that lies in a contiguous range cell or Doppler band. For the results reported here, we cluster ±1 cells in range and ±1 bands in Doppler for all processing schemes. Once we declare a detection in a cluster region, the corresponding test values are removed from the original test statistics to avoid over-counting detections.
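The clustering step can be sketched as a greedy peak-picking loop. This is only an illustration under assumptions not stated in the text: the test statistics are stored as a plain 2-D list indexed `[doppler][range]`, Doppler bands do not wrap around, and the function name `cluster_detections` is hypothetical. Matching detections to true targets and setting the threshold from the false-alarm constraint are not covered.

```python
def cluster_detections(stat, threshold, half_width=1):
    """Greedy peak picking on a 2-D list stat[doppler][range].

    The largest remaining value above the threshold is declared a detection,
    and every cell within +/- half_width of it in both dimensions is removed
    so that sidelobe spill-over is not counted again.
    """
    work = [row[:] for row in stat]  # copy, so the caller's data is untouched
    n_dop, n_rng = len(work), len(work[0])
    detections = []
    while True:
        peak, d0, r0 = max(
            (work[d][r], d, r) for d in range(n_dop) for r in range(n_rng)
        )
        if peak <= threshold:
            break
        detections.append((d0, r0))
        for d in range(max(0, d0 - half_width), min(n_dop, d0 + half_width + 1)):
            for r in range(max(0, r0 - half_width), min(n_rng, r0 + half_width + 1)):
                work[d][r] = float("-inf")
    return detections


# Toy map: two targets, each leaking into neighboring cells.
stat = [
    [0.1, 0.2, 0.1, 0.0, 0.0],
    [0.2, 9.0, 4.0, 0.0, 0.3],
    [0.1, 3.0, 0.2, 0.0, 7.0],
    [0.0, 0.1, 0.0, 0.2, 2.5],
]
detections = cluster_detections(stat, threshold=1.0)
```

In this toy map the cells with values 4.0, 3.0 and 2.5 exceed the threshold but fall inside the ±1 neighborhood of a stronger peak, so only two detections are reported.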

For comparison purposes, the Joint Domain Localized (JDL) technique with a 3×3 local processing region (LPR) is applied to the KASSPER dataset. There are 64 Doppler frequency bands, and two guard cells are used on each side of the test range cell. The number of false alarms $N_{fa}$ is constrained to 10. For the parametric detectors and JDL, two cases are considered: without training limiting and with IPS training limiting.

Table 7.1: Number of detected targets when the number of false alarms $N_{fa}$ = 10 for J = 11, without training limiting and with IPS training limiting (shown in parentheses).

                   Parametric GLRT    Parametric Rao    JDL
K = 11, P = 1      37 (72)            33 (59)           19 (35)
K = 11, P = 2      44 (67)            31 (46)           19 (35)
K = 22, P = 1      42 (73)            43 (67)           20 (38)
K = 22, P = 2      49 (101)           41 (76)           20 (38)

No training limiting: Table 7.1 lists the number of detections when the number of false alarms $N_{fa}$ = 10 for J = 11, where true targets are represented by cross signs and detections by bars. The table shows that, without training selection, having more training data does not substantially improve detection performance. The clairvoyant detector with known covariance matrices can detect 176 out of the 268 targets, and the SMI with K = 999 training signals can detect 32 targets. The JDL can detect up to 20 targets, while the parametric detectors can detect at most 49 targets.

IPS training limiting: To improve the detection performance, we limit the training data by applying Innovation Power Sorting (IPS). In this case, the performance of each technique is listed in Table 7.1, shown in parentheses. The results show that, with IPS training selection, the performance of the parametric detectors improves considerably. When J = 11, K = 22 and P = 2, the parametric GLRT detector can detect over 100 targets, while the JDL can detect 38 targets.

Heterogeneous  Environment 

Since the KASSPER 2002 dataset contains the true interference covariance matrix for every range bin, we can generate realizations of the disturbance as Gaussian processes with the true covariance matrices, and obtain the detection performance of the parametric detectors with respect to SINR ($P_d$ versus SINR for a fixed $P_f$). Meanwhile, we have shown that a large ratio of the number of pulses to the number of channels can improve the performance in the case of limited training [28]. Since the number of pulses in the KASSPER dataset is fixed at N = 32, we alternatively generate the CPI data for J = 4 channels. Since the disturbance signals generated as above do not follow an AR model in general and are more realistic than our earlier simulated data using an AR model, it is of interest to determine the detection performance of the parametric detectors using this simulated dataset and compare it with our earlier simulation results based on an AR model in [28,40]. Besides the violation of AS2, we investigate the performance of the parametric detectors in a simulated heterogeneous environment, in which the i.i.d. assumption in AS1 does not hold. To this end, we generate training signals using covariance matrices different from that of the range bin under test.

In this simulation, we select range bin 200 as the test cell. The SINR is defined as

$$\mathrm{SINR} = |\alpha|^2\, \mathbf{s}^H \mathbf{R}^{-1} \mathbf{s}, \tag{7.19}$$

where the space-time covariance matrix $\mathbf{R}$ is obtained by loading the file Rcc_r200.mat from the KASSPER 2002 dataset.
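As a small worked example of (7.19), the sketch below evaluates the SINR for a given amplitude and inverts the relation to find the amplitude that produces a requested SINR. The steering vector, the placeholder covariance matrix, and the function names are assumptions for illustration; loading the actual KASSPER covariance file is not shown.

```python
import numpy as np


def output_sinr(alpha, s, R):
    """SINR = |alpha|^2 s^H R^{-1} s for steering vector s and covariance R."""
    return (abs(alpha) ** 2) * np.real(np.vdot(s, np.linalg.solve(R, s)))


def amplitude_for_sinr(sinr_db, s, R):
    """Real, positive amplitude that yields the requested SINR (in dB)."""
    gain = np.real(np.vdot(s, np.linalg.solve(R, s)))
    return np.sqrt(10.0 ** (sinr_db / 10.0) / gain)


# Toy space-time steering vector (assumed): J = 2 channels, N = 3 pulses.
spatial = np.exp(1j * 2 * np.pi * 0.25 * np.arange(2))
temporal = np.exp(1j * 2 * np.pi * 0.10 * np.arange(3))
s = np.kron(temporal, spatial)

R = 2.0 * np.eye(6)                       # placeholder covariance, not KASSPER data
sinr_unit = output_sinr(1.0, s, R)        # |s|^2 / 2 = 3 for this toy case
alpha_0db = amplitude_for_sinr(0.0, s, R)
sinr_check = output_sinr(alpha_0db, s, R)
```

Sweeping `sinr_db` and generating data with the corresponding amplitude is one way to produce $P_d$-versus-SINR curves of the kind discussed next.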

Figure 7.3 presents the estimated angle-Doppler power spectral density (PSD) of the disturbance in range bin 200 when J = 11 and N = 32. It shows strong clutter centered at a normalized Doppler frequency of 0.1 and an azimuth angle of 195°. Figures 7.4(a) and 7.4(b) depict the probability of detection versus SINR with K = 0 (no training) or K = 1 (limited training), and P = 1. For the no-training case, the performance of both parametric detectors degrades, while the performance improves with one range training signal. It is worth noting that, for the limited-training case K = 1, the training data was generated using a covariance matrix different from that of the test data, as may occur in heterogeneous environments.

In  [28,40],  we  have  shown  that  the  performance  degradation  caused  by  the  absence  of  training 
can  be  mitigated  by  using  a  larger  N,  i.e.,  increasing  temporal  observations  of  the  test  signal. 
Since  N  =  32  is  fixed  in  the  KASSPER  2002  dataset,  we  alternatively  down-sample  the  CPI 
datacube  to  increase  the  ratio  of  N  to  J,  i.e.,  extracting  data  corresponding  to  J  =  4  channels 
from  the  original  dataset. 

Figure 7.4(c) shows the case with J = 4 and N = 32 with no training data, and Figure 7.4(d) shows the case with limited training. It is seen that a larger ratio of N to J improves the detection performance. Specifically, with K = 1, the performance of the parametric detectors is only about 3 dB inferior to that of the ideal MF detector with known covariance matrix.

7.4.2  Measured  MCARM  Dataset 

In this part, we consider the MCARM dataset, real-world multichannel airborne data containing clutter from various terrains including mountains, rural, urban, and land/sea interface. The MCARM data was collected from the BAC 1-11 airborne platform at L-band. The MCARM array has 16 columns, each consisting of two four-element subarrays. Each subarray has its own output or is combined into a single output per column, with up to 24 outputs for the array.

Since the true joint space-time covariance matrix of the MCARM dataset is unavailable, we adopt another power measure that can be estimated from the data. It is the input SINR (per-pulse, per-channel), defined as

$$\mathrm{SINR}_{IN} = \frac{|\alpha|^2}{\sigma_d^2}, \tag{7.20}$$

where $\alpha$ is the target amplitude and $\sigma_d^2$ denotes the variance (power) of each element of the disturbance vector at each time instant. We carried out the performance analysis of the parametric Rao and GLRT detectors using two acquisitions from the MCARM database. More specifically, rd050575 and re050152 have been used to assess the performance of the parametric detectors.

Case  I  -  Inserted  Artificial  Target 

In this case, we choose the dataset according to the following parameters:

•  One elevation angle (0°)

•  One azimuth angle (0°)

•  Four channels (J = 4)

•  Various K (training), N (pulses), and normalized Doppler frequencies

An artificial target with an $\mathrm{SINR}_{IN}$ of −30 dB is injected in range bin 295. The disturbance power $\sigma_d^2$ is estimated as a five-bin average centered on the range bin in which the target is placed. Model orders P = 1, 2, 3, 4 were evaluated for each parametric test, and the model order with the best performance was selected. The selection criterion is the difference between the target peak value and the highest non-target peak value. Diagonal loading of 40 dB is applied for the AMF detector.
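The amplitude of the injected target follows from (7.20) once the disturbance power has been estimated. The sketch below is an illustration only: it assumes the data are available as a mapping from range bin to a flat list of complex samples (all channels and pulses), that the power is estimated before the target is added, and the function names are hypothetical.

```python
import math


def disturbance_power(cube, center_bin, n_bins=5):
    """Average |sample|^2 over n_bins range bins centered on center_bin."""
    half = n_bins // 2
    samples = [
        v
        for rb in range(center_bin - half, center_bin + half + 1)
        for v in cube[rb]
    ]
    return sum(abs(v) ** 2 for v in samples) / len(samples)


def target_amplitude(sinr_in_db, sigma2):
    """Magnitude |alpha| such that |alpha|^2 / sigma2 equals the requested input SINR."""
    return math.sqrt(10.0 ** (sinr_in_db / 10.0) * sigma2)


# Toy data (assumed): every sample has magnitude 2, so the power estimate is 4.
cube = {rb: [2.0 + 0.0j, 0.0 + 2.0j, -2.0 + 0.0j, 0.0 - 2.0j] for rb in range(290, 301)}
sigma2 = disturbance_power(cube, center_bin=295)
alpha = target_amplitude(-30.0, sigma2)
```

The injected signal would then be this amplitude times the steering vector, added to the samples of range bin 295.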

Figure 7.5 depicts the estimated angle-Doppler power spectral density of the disturbance in RB 295 when J = 4 and N = 128. It shows strong clutter centered at a normalized Doppler frequency of −0.05 and an azimuth angle of 0°.

Figure 7.6(a) shows the case where the target is located outside the clutter (the normalized Doppler frequency is 0.2). However, if the target is located inside the clutter (the normalized Doppler frequency is 0.1), the AMF detector does not work at all, but the parametric Rao and GLRT detectors work well (see Figures 7.6(c) and 7.6(d)).

Fig. 7.6(c) shows the test statistics for the parametric Rao and GLRT detectors without training support, the AMF with K = 8, and the JDL with K = 8, when the target is located inside the clutter. Training data is supplied to the AMF and JDL because they cannot work without it. Clearly, the parametric Rao and GLRT detectors can detect the inserted target with a peak more than 20 dB stronger. Fig. 7.6(d) shows the test statistics for the case of K = 4.

Case  II  -  Five  Targets 

In this case, no artificial target was injected. File re050152 employs a moving target simulator in the radar field of view to simulate a moving point source. The dataset consists of N = 128 pulses, J = 22 channels, and 630 range cells. There are five targets in total. The angle-Doppler power spectral density for range bin 450 is shown in Fig. 7.7, when J = 22 and N = 128. It shows five targets equispaced in normalized Doppler frequency and centered at an azimuth angle of −15°.

Figs. 7.8(a) and 7.8(b) depict the test statistics for the parametric Rao, JDL, and AMF detectors when J = 22, N = 128, P = 1 and K = 11 or K = 22. It is seen that the parametric Rao detector detects all five targets, while the JDL can detect up to four targets
and the AMF detects two targets.

Table 7.2: The Operating Parameters for the Bistatic Dataset

Parameter                        Value
carrier frequency                1240 MHz
receiver bandwidth               1 MHz
number of pulses                 64
number of channels               16
number of range bins             601
pulse repetition frequency       400 Hz
peak transmit power              19.2 kW
transmitting platform altitude   3.1 km
transmitting platform speed      0 m/s (stationary), 100 m/s (moving)
receiving platform altitude      4 km
receiving platform speed         100 m/s
tilt angle (Tx/Rx)               5°

7.4.3  Bistatic  Dataset 

The Bistatic dataset is simulated bistatic airborne radar data containing non-i.i.d. and range-dependent clutter due to the bistatic geometry. The bistatic data used in the performance evaluation of the proposed algorithms was generated using the Stiefvater Consultants Signal Modeling and Simulation (SMS) tool [57] and closely follows the results discussed in [58]. Two cases have been considered. In both cases, the receiver is assumed to be moving at a velocity of 100 m/s, while the transmitter's velocity is assumed to be 0 m/s (case I) or 100 m/s (case II), with an offset angle of 45°. The transmitter center frequency is $f_c$ = 1.24 GHz, the receiver height is $H_R$ = 3.1 km, the transmitter height is $H_T$ = 4 km, and the baseline separation is L = 100 km. The pulse repetition frequency (PRF) was assumed to be 400 Hz. In both cases, the test cell is located 35 km from the receiver and the azimuth of the test cell with respect to the receiver is 135°, producing a bistatic angle of 33.9°. The test cell is located 1 m above the ground. In each case, a linear array of J = 32 sensors and N = 64 pulses were used. Data for 601 range cells was generated using the same radar system parameters as those used in the MCARM experiment [59]. The bistatic geometry is shown in Fig. 7.9.

In both cases, there is only one target, located in test cell 301. The operating parameters for the simulated Bistatic dataset are shown in Table 7.2.

Case  I  -  Non  i.i.d  Case 

In the case of a stationary transmitter, note from Fig. 7.10(a) that, unlike the monostatic side-looking airborne radar, where all traces overlap and the clutter spectral centers are co-located, the clutter spectral centers are spread in both angle and Doppler. This generates spectral dispersion, which makes the secondary data vary with range. This in turn means that the training data is no longer independent and identically distributed (i.i.d.), hence violating a fundamental assumption made in several STAP algorithms.

Table 7.3: Case I: Difference between the target peak and the highest non-target peak

            P-GLRT/ML    P-Rao       P-GLRT/AML2    JDL
K = 0       2.96 dB      0.66 dB     2.51 dB        -
K = 2       9.50 dB      8.17 dB     10.27 dB       -
K = 4       11.58 dB     9.76 dB     13.63 dB       -
K = 8       14.31 dB     12.57 dB    16.27 dB       15.24 dB
K = 16      17.25 dB     15.64 dB    19.08 dB       17.29 dB

For case I, Figure 7.11 shows the test statistics of the parametric Rao detector, the parametric GLRT detector, and the joint domain localized (JDL) detector with a 3×3 local processing region (LPR) with respect to normalized Doppler frequency and range bins. The detection of the target is marked by the rectangle. The number of training signals for the parametric detectors is K = 8. For the JDL, when the number of training signals is limited to K = 8, which is insufficient for a full-rank estimate of the covariance matrix in the angle-Doppler domain, the matrix pseudo-inverse is used. To facilitate performance comparison, the differences between the target peak and the highest non-target peak for various detectors are shown in Table 7.3. Clearly, the parametric Rao and GLRT detectors can detect a target with limited or even no training data support. Specifically, the parametric GLRT detector can detect a target without training data support, with a target peak 2.96 dB higher than the highest non-target peak. When the number of training signals is increased, the performance of all detectors improves. For K = 16, all three detectors exhibit similar detection performance.
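The figure of merit reported in Table 7.3 can be computed along the following lines. This sketch rests on assumptions the text does not spell out: the test statistic is positive and on a linear (power-like) scale so that the margin is $10\log_{10}$ of a ratio, and cells within a one-cell guard neighborhood of the target are excluded from the non-target search. The name `peak_margin_db` is hypothetical.

```python
import math


def peak_margin_db(stat, target_cell, guard=1):
    """Target peak over the highest peak outside the target neighborhood, in dB.

    stat is a 2-D list stat[doppler][range] of positive test statistics.
    A negative result means a non-target cell exceeds the target cell.
    """
    d0, r0 = target_cell
    target_peak = stat[d0][r0]
    others = [
        v
        for d, row in enumerate(stat)
        for r, v in enumerate(row)
        if abs(d - d0) > guard or abs(r - r0) > guard
    ]
    return 10.0 * math.log10(target_peak / max(others))


# Toy map: target of 100 at (1, 2); strongest cell outside its neighborhood is 10.
stat = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 10.0],
    [1.0, 50.0, 100.0, 40.0, 2.0, 3.0],
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
]
margin = peak_margin_db(stat, target_cell=(1, 2))
```

For this toy map the margin is 10 dB; the values 50 and 40 are ignored because they lie inside the guard neighborhood of the target.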

Case  II  -  Range-dependent  Case 

In  the  case  of  a  moving  transmitter,  it  is  clear  from  Fig.  7.10(b)  that  the  angle-Doppler  traces  are 
non-overlapping  and  the  spectral  centers  are  highly  distributed  in  angle  and  Doppler,  due  to  the 
relative  motion  of  the  receive  and  transmit  platforms.  Significant  spectrum  dispersion  over  range 
then  arises,  which  leads  to  significant  performance  degradation  [58]. 

For case II, Figure 7.12 shows the test statistics of the parametric detectors and the JDL with respect to normalized Doppler frequency and range bins. The number of training signals is specified in the figures. In the extreme case, i.e., K = 0, the parametric Rao and GLRT detectors have 1.84 dB and 4.47 dB stronger outputs, respectively, when comparing the test statistics at range bin 301 with those at other range bins. When K = 8, the JDL cannot find any target, while the parametric Rao and GLRT detectors can detect the true target at range bin 301 with 12.94 dB and 14 dB stronger amplitudes, respectively. By increasing the number of training signals to K = 16, the parametric Rao and GLRT detectors exhibit 13.92 dB and 14.59 dB higher peaks at range bin 301, while the JDL finds two distinct peaks, located at range bins 100 and 301, respectively. The differences between the target peak and the highest non-target peak for various detectors in this case are shown in Table 7.4, where the symbol "-" indicates that the detector fails to find the target, and a negative value means that a wrong detection has a higher peak than the correct detection.

Figure 7.1: The flowchart for the parametric GLRT detector


7.5  Conclusions 


We have examined the performance of the parametric Rao and GLRT detectors using the KASSPER 2002, MCARM, and Bistatic datasets, which include many real-world effects. Our results show that these parametric detectors work quite well with limited or no range training data support in more realistic environments. Therefore, they are good candidates for solving detection problems in the presence of range-dependent clutter and/or in heterogeneous environments.

Figure 7.2: The flowchart for the parametric Rao detector


Figure 7.3: The angle-Doppler power spectral density of range bin 200 when J = 11 and N = 32. [Plot title: Estimated Power Spectral Density, RB = 200; horizontal axis: angle (degrees).]

Figure 7.4: The probability of detection versus input SINR for the KASSPER 2002 dataset. [Panels, all with N = 32, $P_f$ = 10^-2, AR(1): (a) K = 0, J = 11; (b) K = 1, J = 11; (c) K = 0, J = 4; (d) K = 1, J = 4.]

Figure 7.5: Estimated angle-Doppler power spectral density for range bin 295 when J = 4 and N = 128. [Plot title: Estimated Power Spectral Density, RB = 295; horizontal axis: angle (degrees).]

Figure 7.6: Test statistics for the MCARM dataset with an inserted artificial target. [Panels, all with J = 4, N = 128, P = 1: (a) K = 0; (b) K = 4; (c) K = 0; (d) K = 4; vertical axis: test statistic.]

Figure 7.7: Estimated angle-Doppler power spectral density for range bin 450 when J = 22 and N = 128.

Figure 7.8: Test statistics of the parametric Rao, JDL and AMF detectors when J = 22, N = 128, P = 1 and K = 11 (a), K = 22 (b). [Horizontal axis: normalized Doppler frequency.]

Figure 7.9: Bistatic airborne radar geometry.

Figure 7.10: Bistatic angle-Doppler traces, Case I (a) and Case II (b). [Panel title: No Compensation.]

Figure 7.11: Case I: Test statistics of parametric GLRT, parametric Rao and JDL with respect to normalized Doppler frequency and range bins when J = 16 and N = 64. The detection is marked by the rectangle. [Panel titles: GLRT: K = 4, J = 16, N = 64, P = 1; JDL (3×3): K = 4, J = 16, N = 64; Rao: K = 4, J = 16, N = 64, P = 1; JDL (3×3): K = 8, J = 16, N = 64; horizontal axis: range bin.]

Figure 7.12: Case II: Test statistics of parametric GLRT, parametric Rao and JDL with respect to normalized Doppler frequency and range bins when J = 16, N = 64. The detection is marked by the rectangle. The JDL with K = 8 fails to detect any target, while the JDL with K = 16 has a strong detection around range bin 100 and a weak detection at range bin 301. [Panel titles: GLRT: K = 8, J = 16, N = 64, P = 1; JDL (3×3): K = 8, J = 16, N = 64; Rao: K = 8, J = 16, N = 64, P = 1; JDL (3×3): K = 16, J = 16, N = 64.]

Chapter 8

Parametric Adaptive Signal Detection for
Hyperspectral Imaging

8.1  Introduction 

Hyperspectral  sensors  are  a  new  class  of  imaging  spectroscopy  sensors  that  divide  the  wave¬ 
band  of  interest  into  hundreds  of  contiguous  narrow  bands.  Their  fine  spectral  resolution  enables 
remote  identification  of  ground  objects  based  on  their  spectral  signatures.  Hyperspectral  imag¬ 
ing  (HSI)  has  a  wide  range  of  applications  including  terrain  classification,  environmental  and 
agricultural monitoring, geological exploration, ordnance remediation, tactical surveillance, and
others  [60]. 

A challenging problem in HSI applications is the so-called subpixel target detection, which involves detecting objects occupying only a portion of a full pixel in an HSI image [4]. In such a case, the signal produced by the HSI sensors consists of both the object and background, the latter behaving effectively as interference that has to be suppressed for effective detection. The problem is reminiscent of that of detecting a known signal with unknown amplitude in colored noise with unknown correlation¹ (e.g., [7]). A multitude of solutions have been developed, including Kelly's generalized likelihood ratio (GLR) test [13], the adaptive matched filter (AMF) [12], and the adaptive coherence estimator (ACE) test [14,15], among others. While these detectors can be used to solve the HSI subpixel target detection problem, there is a major difficulty with them in training-limited scenarios. In particular, the above detectors are covariance-matrix based techniques in that they all rely on an estimate of the background covariance matrix, which is obtained from target-free training pixels. The size of the background covariance matrix is identical to the number of spectral bands, which is typically on the order of hundreds. A good estimate of the covariance matrix would require several hundred or more target-free training pixels, which may not be available in heterogeneous or dense-target environments. Another problem with the above covariance-matrix based detectors is complexity, since the large covariance matrix has to be estimated and inverted frequently.

¹ We take a stochastic approach herein by modeling the background as a correlated random vector with an unknown covariance matrix. There are other detectors based on modeling the background as a deterministic quantity. See [4,61,62] and references therein for details.

There is significant interest in developing training-efficient detection techniques for training-limited applications, such as the above HSI target detection approaches applied in heterogeneous environments. Another example is target detection based on space-time adaptive processing (STAP) for airborne radars [1], where range-dependent clutter characteristics, along with other issues, prevent inclusion of a large number of range cells far away from the test cell in the training set. One effective way to reduce the training requirement in STAP detection is to utilize a suitable parametric model for the radar clutter and exploit the model for target detection. In particular, multichannel autoregressive (AR) models have been found to be very effective in representing the temporal correlation among pulse returns [22]-[25]. A parametric detector based on such a multichannel AR clutter model is developed in [22]-[23], which is referred to as the parametric adaptive matched filter (PAMF). The PAMF detector has been shown to significantly outperform the covariance-matrix based detectors for small training size.
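The training requirement of covariance-matrix based detectors can be illustrated numerically: a sample covariance matrix built from K training vectors has rank at most K, so it cannot be inverted when K is smaller than the data dimension. The sketch below uses synthetic white Gaussian data and a hypothetical dimension of 50 (real HSI sensors have hundreds of bands); it is an illustration, not part of the reported experiments.

```python
import numpy as np


def sample_covariance(training):
    """Sample covariance of zero-mean training vectors stored as columns (L x K)."""
    return training @ training.T / training.shape[1]


rng = np.random.default_rng(0)
n_bands = 50  # hypothetical dimension; chosen small to keep the example fast

ranks = {}
for n_train in (20, 200):
    training = rng.standard_normal((n_bands, n_train))
    ranks[n_train] = int(np.linalg.matrix_rank(sample_covariance(training)))

print(ranks)
```

With 20 training vectors the estimate is rank deficient, whereas 200 vectors give a full-rank 50 × 50 matrix. Parametric models sidestep this by estimating far fewer unknowns.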

For HSI applications, however, the data is non-stationary in the spectral domain (see Section 8.4.1 for details of such non-stationarity),² whereas AR models are by definition stationary.
To  account  for  such  non-stationarity,  we  introduce  in  this  chapter  a  sliding-window  based  non- 
stationary  AR  (NS-AR)  model  to  capture  the  spectral  correlation  of  HSI  data.  We  propose  a  class 
of  parametric  adaptive  signal  detectors  for  HSI  subpixel  target  detection,  and  develop  a  maxi¬ 
mum  likelihood  (ML)  estimation  algorithm  to  estimate  the  parameters  associated  with  the  NS-AR 
model.  In  addition,  we  develop  model  order  selection,  training  screening,  and  time-series  based 
whitening  and  detection  techniques,  which  are  intrinsic  parts  of  the  proposed  parametric  adaptive 
detectors.  We  show  via  experimental  results  with  real  HSI  data  that  our  proposed  parametric 
detectors  are  more  efficient  in  training  usage  and  outperform  the  conventional  covariance-matrix 
based  detectors  when  the  training  size  is  limited. 

The  rest  of  the  chapter  is  organized  as  follows.  Section  8.2  contains  the  data  model  and 
problem  statement.  The  covariance-matrix  based  detectors  are  briefly  reviewed  and  discussed  in 
Section  8.3.  The  proposed  techniques,  including  an  NS-AR  model,  a  class  of  parametric  adaptive 
detectors,  an  ML  parameter  estimation  algorithm,  a  model  order  selection  method,  and  a  training 
screening  approach,  are  detailed  in  Section  8.4.  Experimental  results  illustrating  the  performance 
of  the  proposed  detectors  under  homogeneous,  heterogeneous,  and  dense-target  environments  are 
presented  in  Section  8.5.  Finally,  Section  8.6  contains  our  concluding  remarks. 


8.2  Data  Model  and  Problem  Statement 

Obtained through both spatial and spectral sampling, HSI data is usually described as a datacube, whose face is a function of the spatial coordinates and whose depth is a function of spectral bands or wavelengths. Each pixel can be represented as an $L \times 1$ real-valued vector: $\mathbf{x} = [x(0), x(1), \ldots, x(L-1)]^T$, where $L$ denotes the total number of spectral bands, $x(l)$ denotes the spectral response at the $l$th spectral band, and $(\cdot)^T$ denotes transpose. Since HSI data has non-zero mean [4,65], a preprocessing stage is usually invoked to remove the sample mean estimated using the neighbor pixels.

² Such spectral non-stationarity should not be confused with the spatial stationarity which is often assumed for HSI data [4].

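In vector notation, the subpixel signal detection problem is described by the following composite hypothesis test [4]:

$$
\begin{aligned}
H_0 &: \mathbf{x} = \mathbf{b}, && \text{target absent} \\
H_1 &: \mathbf{x} = a\mathbf{s} + \mathbf{b}, && \text{target present}
\end{aligned}
\tag{8.1}
$$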

where $\mathbf{x} \in \mathbb{R}^{L \times 1}$ is the demeaned test pixel, $\mathbf{s} \in \mathbb{R}^{L \times 1}$ is the signature vector of the target object with amplitude $a$, and $\mathbf{b} \in \mathbb{R}^{L \times 1}$ denotes the background plus system noise. We adopt the standard assumption that the signature vector $\mathbf{s}$ is deterministic and known to the detector;³ the amplitude $a$, however, is assumed unknown. For the background, we follow a statistical approach that models the background interference $\mathbf{b}$ as a multivariate Gaussian random vector with zero mean and an unknown covariance matrix $\mathbf{R}_b = E\{\mathbf{b}\mathbf{b}^T\}$. The Gaussian assumption has been widely used for multispectral (e.g., [65]) and HSI data [4]. It leads to mathematical tractability and reasonably good performance. Nevertheless, it should be noted that a Gaussian model is not fully appropriate to characterize the statistical behavior of HSI data in many realistic cases, and alternative modeling approaches have been considered in [66]-[68].

Equation (8.1) implies that the background interference covariance matrix is the same under both hypotheses. Since for a subpixel target the area covered by background is different under the two hypotheses, it is more appropriate to consider the following modified hypothesis test [4,69]:

$$
\begin{aligned}
H_0 &: \mathbf{x} = \mathbf{b}, && \text{target absent} \\
H_1 &: \mathbf{x} = a\mathbf{s} + \sigma\mathbf{b}, && \text{target present,}
\end{aligned}
\tag{8.2}
$$

where the background scaling factor $\sigma$ is unknown and, along with the signature amplitude $a$, determined by the target fill factor, i.e., the percentage of the pixel area occupied by the target [4].

Similar to [4], we assume that in addition to the test pixel $\mathbf{x}$, we have $N$ training pixels $\mathbf{x}_1, \ldots, \mathbf{x}_N$. In surveillance applications where the target class is rare or sparsely populated, the training pixels are usually taken as those surrounding the test pixel and assumed target-free [4]. Again similar to [4], we assume that $\mathbf{x}_1, \ldots, \mathbf{x}_N$ are independent and identically distributed (i.i.d.) Gaussian random vectors with zero mean and covariance matrix $\mathbf{R}_b$, and independent of the test pixel $\mathbf{x}$.

The problem in question is to find an efficient decision rule for the composite hypothesis testing problem (8.1) or (8.2), given knowledge of the test pixel $\mathbf{x}$, target signal signature $\mathbf{s}$, and training pixels $\mathbf{x}_1, \ldots, \mathbf{x}_N$. Our goal is to achieve good detection performance for small $N$.

Before closing this section, we remark that our parametric detection schemes, as well as many others (e.g., in [4]), rely on perfect knowledge of the target spectral signature. Generally, the target signature is available in its reflectance spectrum, whereas the HSI sensors measure the radiance spectrum of the observed materials. In order to apply these detection techniques, the HSI data must be pre-processed to obtain reflectance data from the radiance data (e.g., through atmospheric correction) or, alternatively, the target spectral reflectance must be processed to obtain the radiance spectrum. See [70,71] for details.

³The spectral signature may vary due to variations in atmospheric conditions and other factors, and the uncertainty can be captured by a linear mixing model [4]. We do not consider such spectral variations since our focus is effective cancellation of the background.


8.3  Covariance-Matrix  Based  Solutions 

A number of solutions to the above problem have been developed. If the covariance matrix $\mathbf{R}_b$ is known exactly, the optimum detector for (8.1) with unknown signal amplitude is the matched filter (MF) [12]:

$$
\frac{|\mathbf{s}^T \mathbf{R}_b^{-1} \mathbf{x}|^2}{\mathbf{s}^T \mathbf{R}_b^{-1} \mathbf{s}} \underset{H_0}{\overset{H_1}{\gtrless}} \tau_{\text{MF}},
\tag{8.3}
$$

where $\tau_{\text{MF}}$ denotes the MF threshold. The MF detector is obtained by a GLR approach (e.g., [7]), by which the ML estimate of the unknown amplitude $a$ is first obtained and then substituted back into the likelihood ratio to form a test statistic. In practice, the MF detector cannot be implemented since $\mathbf{R}_b$ is typically unknown. However, it provides a baseline for performance comparison when considering any realizable detection scheme.

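In practice, the unknown $\mathbf{R}_b$ can be replaced by some estimate, such as the sample covariance matrix obtained from the training pixels:

$$
\hat{\mathbf{R}}_b = \frac{1}{N} \sum_{n=1}^{N} \mathbf{x}_n \mathbf{x}_n^T.
\tag{8.4}
$$

Using $\hat{\mathbf{R}}_b$ in (8.3) leads to the so-called AMF detector [12]:

$$
\frac{|\mathbf{s}^T \hat{\mathbf{R}}_b^{-1} \mathbf{x}|^2}{\mathbf{s}^T \hat{\mathbf{R}}_b^{-1} \mathbf{s}} \underset{H_0}{\overset{H_1}{\gtrless}} \tau_{\text{AMF}},
\tag{8.5}
$$

where $\tau_{\text{AMF}}$ denotes the AMF threshold.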

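Alternatively, one can treat both $a$ and $\mathbf{R}_b$ as unknowns and estimate them successively by ML. Such a GLR approach was pursued by Kelly [13], which gives the following Kelly test:

$$
\frac{|\mathbf{s}^T \hat{\mathbf{R}}_b^{-1} \mathbf{x}|^2}{\left(\mathbf{s}^T \hat{\mathbf{R}}_b^{-1} \mathbf{s}\right)\left(N + \mathbf{x}^T \hat{\mathbf{R}}_b^{-1} \mathbf{x}\right)} \underset{H_0}{\overset{H_1}{\gtrless}} \tau_{\text{Kelly}},
\tag{8.6}
$$

where $\tau_{\text{Kelly}}$ denotes the corresponding threshold.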
Another popular detector is the ACE test [14,15]:

$$
\frac{|\mathbf{s}^T \hat{\mathbf{R}}_b^{-1} \mathbf{x}|^2}{\left(\mathbf{s}^T \hat{\mathbf{R}}_b^{-1} \mathbf{s}\right)\left(\mathbf{x}^T \hat{\mathbf{R}}_b^{-1} \mathbf{x}\right)} \underset{H_0}{\overset{H_1}{\gtrless}} \tau_{\text{ACE}},
\tag{8.7}
$$

which is obtained by a GLR procedure that takes into account not only the unknown amplitude $a$ and background covariance matrix $\mathbf{R}_b$, but also the variability of the variance of the background under $H_0$ and $H_1$. Interestingly, the ACE test is the AMF test (8.5) normalized by the energy of the test pixel weighted by the covariance matrix inverse $\hat{\mathbf{R}}_b^{-1}$. By the Cauchy-Schwarz inequality, one can see that the ACE test statistic is bounded between zero and one.

The AMF, Kelly, and ACE tests have the constant false alarm rate (CFAR) property. However, they entail a large training requirement. The covariance matrix $\mathbf{R}_b$ has dimension $L \times L$. Typical values for $L$ in real HSI systems are in the hundreds. An accurate estimate of the covariance matrix would require a large number of target-free training pixels, which may not be available, especially in non-homogeneous environments. In addition, the computational complexity of these detectors is high, since $\hat{\mathbf{R}}_b$ has to be estimated and inverted frequently.
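The covariance-matrix based statistics (8.5)-(8.7) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names and the `training` array layout (one demeaned pixel per row) are assumptions made here, and thresholds are left to the caller.

```python
import numpy as np

def sample_covariance(training):
    """Sample covariance (8.4); `training` is N x L, one demeaned pixel per row."""
    n = training.shape[0]
    return training.T @ training / n

def detector_statistics(x, s, training):
    """Return the AMF (8.5), Kelly (8.6) and ACE (8.7) test statistics."""
    n = training.shape[0]
    r_hat = sample_covariance(training)
    # Solve linear systems rather than forming the explicit inverse.
    ri_s = np.linalg.solve(r_hat, s)   # R^{-1} s
    ri_x = np.linalg.solve(r_hat, x)   # R^{-1} x
    num = float(s @ ri_x) ** 2         # |s^T R^{-1} x|^2
    s_energy = float(s @ ri_s)         # s^T R^{-1} s
    x_energy = float(x @ ri_x)         # x^T R^{-1} x
    return {
        "amf": num / s_energy,
        "kelly": num / (s_energy * (n + x_energy)),
        "ace": num / (s_energy * x_energy),
    }
```

Note that `np.linalg.solve` requires the sample covariance to be non-singular, which needs at least $N \ge L$ training pixels; this is exactly the training burden discussed above.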


8.4  Proposed  Approach 

In this section, we present a class of parametric adaptive signal detectors with reduced training requirement. The proposed detectors, which are detailed in Section 8.4.2, rely on an NS-AR model introduced in Section 8.4.1, an ML parameter estimation algorithm derived in Section 8.4.3, a model order selection method discussed in Section 8.4.4, and a training screening technique presented in Section 8.4.5.

8.4.1  Parametric  Modeling  of  HSI  Data 

It is well known that the interference suppression ability of the detectors discussed in Section 8.3 comes from a whitening procedure. Consider, for example, the AMF detector (8.5). The whitening operation takes as inputs the signature vector $\mathbf{s}$ and test pixel $\mathbf{x}$, and outputs whitened versions:

$$
\tilde{\mathbf{s}} = \hat{\mathbf{R}}_b^{-1/2} \mathbf{s}, \qquad \tilde{\mathbf{x}} = \hat{\mathbf{R}}_b^{-1/2} \mathbf{x},
\tag{8.8}
$$

where $\hat{\mathbf{R}}_b^{-1/2}$ denotes the matrix square root of $\hat{\mathbf{R}}_b^{-1}$. Following the whitening, the AMF detector reduces to simple correlation of the whitened outputs:

$$
\frac{|\tilde{\mathbf{s}}^T \tilde{\mathbf{x}}|^2}{\tilde{\mathbf{s}}^T \tilde{\mathbf{s}}} \underset{H_0}{\overset{H_1}{\gtrless}} \tau_{\text{AMF}}.
\tag{8.9}
$$

If the whitening operation can be designed or approximated via a parametric model without explicitly estimating $\mathbf{R}_b$, then it is conceivable that fewer training pixels are needed, provided that the parametric model is parsimonious enough (without an extraordinary number of parameters). This is the essence of our parametric model based methods. Next, we consider two different parametric modeling approaches.
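The equivalence between the AMF statistic (8.5) and the correlation of whitened vectors (8.9) can be checked numerically. The sketch below uses the symmetric inverse square root obtained from an eigendecomposition; this is one possible choice of matrix square root, assumed here for illustration, and the function names are hypothetical.

```python
import numpy as np

def inverse_sqrt(r):
    """Symmetric inverse square root of a symmetric positive-definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(r)
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T

def amf_direct(x, s, r_hat):
    """AMF statistic computed directly from (8.5)."""
    ri_s = np.linalg.solve(r_hat, s)            # R^{-1} s
    return float(ri_s @ x) ** 2 / float(s @ ri_s)

def amf_whitened(x, s, r_hat):
    """AMF statistic computed from the whitened vectors of (8.8), as in (8.9)."""
    w = inverse_sqrt(r_hat)
    s_t, x_t = w @ s, w @ x
    return float(s_t @ x_t) ** 2 / float(s_t @ s_t)
```

Both functions return the same value up to rounding error, which is why replacing the explicit matrix whitening by a cheaper parametric whitening filter leaves the form of the detector unchanged.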
AR Modeling


AR models have been popular choices for parametric modeling in spectral analysis, speech coding, wireless channel modeling, and seismic signal processing, among others (e.g., [30]). Parametric adaptive detection based on multichannel AR models has been considered in [22,23,72,73] for airborne radar systems equipped with multiple antennas. It was shown that significant savings in training and complexity can be achieved by fitting the interference and radar clutter to suitable multichannel AR models.

For the problem under study, the $L \times 1$ background vector $\mathbf{b}$, or equivalently the observed signal $\mathbf{x}$ under $H_0$, may be assumed to be generated by a scalar AR process which produces the $L$ samples of $\mathbf{b}$. If an AR model is appropriate for HSI data, then the detection problem amounts to first estimating the AR coefficients from training data, whitening the signals by a whitening filter constructed from the AR coefficient estimates, and computing the decision statistic from the whitened signals followed by thresholding. For brevity, the above approach is referred to as the parametric adaptive matched filter (PAMF)⁴, or normalized PAMF (NPAMF) [72] if the decision variable is normalized, similar to the normalization imposed by the ACE detector of (8.7).

We have tested the above AR-based PAMF/NPAMF detectors with real HSI data using fixed AR parameters across the spectral domain and found that they suffer a performance loss compared to the methods proposed here. The reason is that fixed-parameter AR models are not suitable parametric models for HSI data. In particular, we find that HSI data are non-stationary in the spectral dimension, whereas fixed-parameter AR models characterize stationary random processes. To see this, we have computed the sample covariance matrix $\hat{\mathbf{R}}$ from a total of $K = 24 \times 46 = 1104$ training pixels drawn from a homogeneous region of the HSI data described in Section 8.5. Figure 8.1 depicts the main diagonal and the first 3 sub-diagonals of $\hat{\mathbf{R}}$, which correspond to the autocorrelation function (ACF) at spectral lag 0 (i.e., variance), lag 1, lag 2, and lag 3, respectively, versus the spectral bands. Clearly, the signal is not stationary since the variance and the ACF at other lags vary significantly across the spectral bands.
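The stationarity check described above amounts to reading off the diagonals of the sample covariance matrix. A minimal sketch follows, assuming the demeaned pixels are stacked as rows of an array `pixels`; the function name and the `max_lag` parameter are illustrative choices, not part of the original study.

```python
import numpy as np

def acf_by_band(pixels, max_lag=3):
    """Band-dependent ACF: diagonals of the sample covariance matrix.

    `pixels` is K x L (one demeaned pixel per row). Entry `lag` of the result
    holds R_hat[l, l + lag] for l = 0, ..., L - lag - 1.
    """
    r_hat = pixels.T @ pixels / pixels.shape[0]
    return {lag: np.diagonal(r_hat, offset=lag).copy()
            for lag in range(max_lag + 1)}
```

For a stationary process each returned array would be roughly constant across the band index; a pronounced variation, as reported for Figure 8.1, indicates spectral non-stationarity.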

NS-AR  Modeling 

Although HSI data is non-stationary (NS) across the entire spectral dimension, it may be considered approximately stationary over a sufficiently small number of adjacent spectral bands. This can be seen from Figure 8.1, where the variation of the sample statistics over a few adjacent spectral bands is considerably smaller than that over the entire set of spectral bands. In the following, we consider an NS-AR modeling approach that takes into account such local stationarity of HSI data. Specifically, let $x_n(l)$ denote the spectral response at the $l$th spectral band of the $n$th training pixel $\mathbf{x}_n$, that is, $\mathbf{x}_n = [x_n(0), \ldots, x_n(L-1)]^T$. Then, we slice $\mathbf{x}_n$ into $L - L_s + 1$ overlapping subvectors:

$$
\mathbf{x}_{n,l} = [x_n(l), \ldots, x_n(l + L_s - 1)]^T, \qquad l = 0, \ldots, L - L_s,
\tag{8.10}
$$

⁴Details of the PAMF and NPAMF detectors can be inferred from the proposed NS-PAMF and NS-NPAMF detectors discussed in Section 8.4.2, as the former are special cases of the latter.

where $L_s < L$ denotes the length of the subvectors. Equivalently, these subvectors can be thought of as being obtained by windowing $\mathbf{x}_n$ using a sliding window of size $L_s$. For sufficiently small $L_s$, each subvector $\mathbf{x}_{n,l}$ can be modeled as an $M$th-order AR process:

$$
x_n(k) = -\sum_{m=1}^{M} a_l(m)\, x_n(k - m) + w_{n,l}(k), \qquad k = l, \ldots, l + L_s - 1;\; n = 1, \ldots, N,
\tag{8.11}
$$

where $w_{n,l}(k)$ denotes the modeling residual for the $l$th subvector $\mathbf{x}_{n,l}$. The residual is Gaussian (since $x_n(k)$ is so) with zero mean and variance $\sigma_l^2$, and spectrally white so that $\{w_{n,l}(k)\}$ are independent with respect to $k$ and $n$ [30]. Note that the $l$th set of AR coefficients, $a_l(1), \ldots, a_l(M)$, is associated with the $l$th subvector $\mathbf{x}_{n,l}$, and that different subvectors are associated with different sets of AR coefficients. For simplicity, we consider a fixed AR model order $M$ (also see discussions in Section 8.4.4).
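The sliding-window slicing of a pixel into overlapping subvectors can be sketched as follows. The function name is an assumption made for this illustration; the row index plays the role of the window position $l$.

```python
import numpy as np

def sliding_subvectors(x, ls):
    """Return the (L - ls + 1) x ls array whose l-th row is [x(l), ..., x(l + ls - 1)]."""
    x = np.asarray(x)
    if not 0 < ls <= x.size:
        raise ValueError("window length must satisfy 0 < ls <= L")
    return np.stack([x[l:l + ls] for l in range(x.size - ls + 1)])
```

Each row is then treated as a short, approximately stationary segment with its own set of AR coefficients, so adjacent rows share all but one sample.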

From the estimation perspective, the choice of $M$ and window size $L_s$ should be made with tradeoffs among the bias, variance, and stationarity of the modeling approach. A large $M$ might be desirable since it can provide a better fit (lower bias) to the HSI data. Increasing $M$, however, would require the window size $L_s$ to increase accordingly, since more parameters are to be estimated and, therefore, more data should be provided within each subvector to reduce the variance of the parameter estimates. If $L_s$ is too large, the assumption of stationarity within the subvector may be violated, which can cause significant degradation. From the application aspect, however, these parameters are related to the HSI sensor characteristics, such as the operating spectral range, spectral resolution, etc. For the HSI data used in this chapter, we found a window size $8 < L_s < 15$ is generally appropriate for modeling. Once $L_s$ is selected, we can use information criterion based model order selection techniques to determine $M$. We leave the details to Section 8.4.4.

Instead of the above sliding-window based NS-AR modeling approach, one can consider an alternative NS-AR model that models the HSI data across all the spectral bands:

$$
x_n(l) = -\sum_{m=1}^{M} b_l(m)\, x_n(l - m) + v_n(l), \qquad l = 0, \ldots, L - 1,
\tag{8.12}
$$

where $b_l(m)$ denotes the shift-varying AR coefficient and $v_n(l)$ the fitting error of the $l$th sample. Note that the above model differs from (8.11) in that the AR coefficients vary from sample to sample, whereas in (8.11) they are assumed to remain fixed within a sliding subvector of $L_s$ samples. An additional parametric model for the shift-varying AR coefficients $\{b_l(m)\}$ is necessary to ensure they can be estimated. This doubly parametric approach is more sensitive to the choice of the parameters, whose estimation is also considerably more involved. In the following, we consider only the sliding-window based NS-AR modeling approach.
8.4.2  NS-AR  Model  Based  Parametric  Adaptive  Detectors

If the above NS-AR model (8.11) is appropriate for modeling target-free HSI data (i.e., the background), then a time-series based (as opposed to the previous covariance-matrix based) whitening process can be developed without explicitly estimating $\mathbf{R}_b$. This leads to a class of parametric adaptive detectors that are summarized below:

• Step 1 - Parameter Estimation: Estimate the NS-AR coefficients $\{a_l(m)\}$ in (8.11) and the variance $\{\sigma_l^2\}$ of the residual from the training pixels $\{\mathbf{x}_n\}_{n=1}^{N}$ by using an ML based estimation algorithm detailed in Section 8.4.3. Let $\{\hat{a}_l(m), \hat{\sigma}_l^2\}$ denote the parameter estimates.


• Step 2 - Whitening: Form a shift-varying moving-average (MA) whitening filter from the parameter estimates $\{\hat{a}_l(m), \hat{\sigma}_l^2\}$, and whiten the test pixel $\mathbf{x}$ and target signature $\mathbf{s}$ as follows:

$$
\begin{aligned}
\tilde{x}(l) &= \frac{1}{\hat{\sigma}_{l-L_s}} \left[ x(l) + \sum_{m=1}^{M} \hat{a}_{l-L_s}(m)\, x(l - m) \right], \\
\tilde{s}(l) &= \frac{1}{\hat{\sigma}_{l-L_s}} \left[ s(l) + \sum_{m=1}^{M} \hat{a}_{l-L_s}(m)\, s(l - m) \right],
\end{aligned}
\qquad l = L_s, \ldots, L - 1,
\tag{8.13}
$$

where $\tilde{x}(l)$ and $\tilde{s}(l)$ denote the $l$th output sample of the whitening filter when the input is the test pixel $\mathbf{x}$ and target signature $\mathbf{s}$, respectively. It should be noted from (8.13) that each set of the NS-AR parameter estimates, i.e., $\{\hat{a}_l(m)\}_{m=1}^{M}$ and $\hat{\sigma}_l$, is used to compute one pair of output samples $\tilde{x}(l)$ and $\tilde{s}(l)$; as the sliding window shifts to the next position, we use the next set of parameter estimates for whitening. In effect, (8.13) implements the whitening operation (8.8) in a time-series fashion by taking into account the NS nature of the signal. For an input of $L$ spectral samples, the time-series based whitening filter outputs $L - L_s$ whitened samples due to initialization of the whitening filter. Although such dimensionality reduction may affect the detection performance, the impact is negligible for small $L_s$ and large $L$, which is typical in HSI systems.


• Step 3 - Detection: The outputs of the shift-varying whitening filter corresponding to the test pixel $\mathbf{x}$ and target signature $\mathbf{s}$, respectively, are used to form the decision statistic. Depending upon how the decision statistic is formed, we have a class of parametric detectors. For example, the parametric counterparts of the covariance-matrix based AMF (8.5) and

Finally, it is noted that the above NS-PAMF and NS-NPAMF detectors reduce to the PAMF and NPAMF detectors, respectively, that are briefly discussed in Section 8.4.1, when $L_s = L$, that is, when the sliding window reaches its maximum size and covers all spectral bands. In that case, the NS-AR model in (8.11) reduces to the standard stationary AR model.
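The whitening and detection steps can be sketched as below. The whitening follows the index convention of (8.13). The two statistics are an assumption made for this illustration: they take the correlation form of (8.9) for the NS-PAMF and add the ACE-style normalization of (8.7) for the NS-NPAMF, since the exact expressions are not reproduced in this excerpt. All function and argument names are hypothetical.

```python
import numpy as np

def ns_whiten(v, a_hat, sigma_hat, ls):
    """Shift-varying MA whitening; returns the L - ls outputs for l = ls, ..., L - 1.

    a_hat[i] holds (a_i(1), ..., a_i(M)) and sigma_hat[i] the residual standard
    deviation estimated for window position i.
    """
    v = np.asarray(v, dtype=float)
    m_order = a_hat.shape[1]
    out = np.empty(v.size - ls)
    for l in range(ls, v.size):
        i = l - ls                       # parameter set used for band l
        past = v[l - m_order:l][::-1]    # v(l-1), ..., v(l-M)
        out[i] = (v[l] + a_hat[i] @ past) / sigma_hat[i]
    return out

def ns_statistics(x, s, a_hat, sigma_hat, ls):
    """Assumed NS-PAMF and NS-NPAMF statistics built from whitened x and s."""
    x_t = ns_whiten(x, a_hat, sigma_hat, ls)
    s_t = ns_whiten(s, a_hat, sigma_hat, ls)
    pamf = float(s_t @ x_t) ** 2 / float(s_t @ s_t)
    npamf = pamf / float(x_t @ x_t)      # normalized version, bounded by 1
    return pamf, npamf
```

Only $M + 1$ parameters per window position enter the filter, so no $L \times L$ matrix is ever estimated or inverted.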


8.4.3  ML  Estimation  of  NS-AR  Coefficients 

Parameter estimation plays a critical role for the proposed parametric detectors. In this section, we present an ML estimator to estimate the NS-AR coefficients in (8.11) using training pixels $\mathbf{x}_1, \ldots, \mathbf{x}_N$. Our ML estimator is an extension of that in [30] for fixed AR models to NS-AR processes.

Consider the vector of AR coefficients of model order $M$: $\mathbf{a}_l = [a_l(1), \ldots, a_l(M)]^T$. According to the statistical assumptions made in Section 8.2 and the NS-AR model (8.11), the $l$th set of subvectors $\mathbf{x}_{1,l}, \ldots, \mathbf{x}_{N,l}$ formed from the $N$ training pixels are i.i.d. multivariate Gaussian, with a joint probability density function (PDF) parameterized by the NS-AR coefficients $\mathbf{a}_l$ and variance $\sigma_l^2$. Then, the ML estimates of $\mathbf{a}_l$ and $\sigma_l^2$ are obtained by maximizing the joint PDF $p(\mathbf{x}_{1,l}, \ldots, \mathbf{x}_{N,l}; \mathbf{a}_l, \sigma_l^2)$. Exact maximization of the joint PDF with respect to the unknown parameters turns out to be highly involved computationally [74]. Instead, we seek to optimize a conditional PDF, which produces an asymptotic ML estimate of the parameters for large data size [30]. Specifically, let

$$
\mathbf{x}_{n,l}^{(1)} = [x_n(l), \ldots, x_n(l + M - 1)]^T,
\tag{8.16}
$$

$$
\mathbf{x}_{n,l}^{(2)} = [x_n(l + M), \ldots, x_n(l + L_s - 1)]^T,
\tag{8.17}
$$

which collect the first $M$ and, respectively, the last $L_s - M$ samples of $\mathbf{x}_{n,l}$. Thus, we have $\mathbf{x}_{n,l} = [\mathbf{x}_{n,l}^{(1)T}, \mathbf{x}_{n,l}^{(2)T}]^T$. Our asymptotic ML estimator seeks to maximize the joint conditional PDF $p(\mathbf{x}_{1,l}^{(2)}, \ldots, \mathbf{x}_{N,l}^{(2)} \mid \mathbf{x}_{1,l}^{(1)}, \ldots, \mathbf{x}_{N,l}^{(1)}; \mathbf{a}_l, \sigma_l^2)$ with respect to $\mathbf{a}_l$ and $\sigma_l^2$. We will write the conditional PDF as $p(\mathbf{x}_l^{(2)} \mid \mathbf{x}_l^{(1)}; \mathbf{a}_l, \sigma_l^2)$ for brevity.


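To find an explicit form of the above conditional PDF, we observe from (8.11) that

$$
w_{n,l}(k) = x_n(k) + \sum_{m=1}^{M} a_l(m)\, x_n(k - m), \qquad k = l, l + 1, \ldots, l + L_s - 1;\; n = 1, \ldots, N.
\tag{8.18}
$$

Since $\{w_{n,l}(k)\}$ are i.i.d. Gaussian with zero mean and variance $\sigma_l^2$, we have (e.g., [30])

$$
p\left(\mathbf{x}_l^{(2)} \mid \mathbf{x}_l^{(1)}; \mathbf{a}_l, \sigma_l^2\right) = (2\pi\sigma_l^2)^{-N(L_s - M)/2} \exp\left\{ -\frac{1}{2\sigma_l^2} \sum_{n=1}^{N} \sum_{k=l+M}^{l+L_s-1} \left[ x_n(k) + \sum_{m=1}^{M} a_l(m)\, x_n(k - m) \right]^2 \right\}.
\tag{8.19}
$$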

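Maximizing the above conditional PDF is equivalent to minimizing the negative log-likelihood function

$$
V(\mathbf{a}_l, \sigma_l^2) = -\ln p\left(\mathbf{x}_l^{(2)} \mid \mathbf{x}_l^{(1)}; \mathbf{a}_l, \sigma_l^2\right).
\tag{8.20}
$$

Define an $(L_s - M) \times M$ matrix

$$
\mathbf{X}_{n,l} = \begin{bmatrix} x_n(l + M - 1) & \cdots & x_n(l) \\ \vdots & & \vdots \\ x_n(l + L_s - 2) & \cdots & x_n(l + L_s - M - 1) \end{bmatrix}.
\tag{8.21}
$$

Then, $V(\mathbf{a}_l, \sigma_l^2)$ can be more compactly expressed as

$$
V(\mathbf{a}_l, \sigma_l^2) = C_1 + \frac{1}{2} N (L_s - M) \ln \sigma_l^2 + \frac{1}{2\sigma_l^2} \sum_{n=1}^{N} \left\| \mathbf{x}_{n,l}^{(2)} + \mathbf{X}_{n,l} \mathbf{a}_l \right\|^2,
\tag{8.22}
$$

where⁵

$$
C_1 = \frac{1}{2} N (L_s - M) \ln(2\pi).
\tag{8.23}
$$

Taking the derivative of $V(\mathbf{a}_l, \sigma_l^2)$ with respect to $\sigma_l^2$ and setting it to zero yields

$$
\hat{\sigma}_l^2(\mathbf{a}_l) = \frac{1}{N (L_s - M)} \sum_{n=1}^{N} \left\| \mathbf{x}_{n,l}^{(2)} + \mathbf{X}_{n,l} \mathbf{a}_l \right\|^2.
\tag{8.24}
$$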


⁵We keep the constant term $C_1$, which depends on $M$, for model order selection in Section 8.4.4.

Substituting $\hat{\sigma}_l^2(\mathbf{a}_l)$ back into (8.22) reduces the cost function to

$$
V(\mathbf{a}_l, \hat{\sigma}_l^2) = C_1 + C_2 + \frac{1}{2} N (L_s - M) \ln \hat{\sigma}_l^2(\mathbf{a}_l),
\tag{8.25}
$$

where

$$
C_2 = \frac{1}{2} N (L_s - M).
\tag{8.26}
$$


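Therefore, the ML estimate of $\mathbf{a}_l$ is obtained by minimizing $\hat{\sigma}_l^2(\mathbf{a}_l)$, the variance of the NS-AR modeling residual. The solution is obtained by least-squares fitting:

$$
\hat{\mathbf{a}}_l = -\left( \sum_{n=1}^{N} \mathbf{X}_{n,l}^T \mathbf{X}_{n,l} \right)^{-1} \left( \sum_{n=1}^{N} \mathbf{X}_{n,l}^T \mathbf{x}_{n,l}^{(2)} \right), \qquad l = 0, 1, \ldots, L - L_s.
\tag{8.27}
$$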


The matrix within the first pair of brackets is assumed non-singular. A necessary condition for non-singularity is that the number of training pixels $N$ is such that

$$
N > \frac{M}{L_s - M}.
\tag{8.28}
$$

This is because the above matrix inverse can be expressed as $(\mathbf{X}_l^T \mathbf{X}_l)^{-1}$, where

$$
\mathbf{X}_l = \left[ \mathbf{X}_{1,l}^T, \ldots, \mathbf{X}_{N,l}^T \right]^T
\tag{8.29}
$$

is a tall matrix when the above condition is satisfied. On the other hand, when $N > M/(L_s - M)$, $\mathbf{X}_l^T \mathbf{X}_l$ is full rank almost surely due to the random nature of the HSI data.

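Finally, substituting the ML estimate (8.27) back into (8.24) yields the minimum variance of the residual:

$$
\hat{\sigma}_l^2 = \frac{1}{N (L_s - M)}\, \mathbf{x}_l^{(2)T}\, \mathbf{P}_{\mathbf{X}_l}^{\perp}\, \mathbf{x}_l^{(2)},
\tag{8.30}
$$

where $\mathbf{x}_l^{(2)} = \left[ \mathbf{x}_{1,l}^{(2)T}, \ldots, \mathbf{x}_{N,l}^{(2)T} \right]^T$ and $\mathbf{P}_{\mathbf{X}_l}^{\perp}$ is the projection matrix onto the null space of $\mathbf{X}_l^T$:

$$
\mathbf{P}_{\mathbf{X}_l}^{\perp} = \mathbf{I} - \mathbf{X}_l \left( \mathbf{X}_l^T \mathbf{X}_l \right)^{-1} \mathbf{X}_l^T,
\tag{8.31}
$$

where $\mathbf{I}$ is an identity matrix.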
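The least-squares solution (8.27) and the residual variance (8.30) can be computed per window position as sketched below. The sketch uses `np.linalg.lstsq` on the stacked matrix of (8.29) instead of forming the normal equations explicitly; this is a numerical convenience assumed here, and the function names are hypothetical.

```python
import numpy as np

def build_regression(xn, l, ls, m_order):
    """Return (X_{n,l}, x_{n,l}^{(2)}) of (8.21) and (8.17) for one pixel xn."""
    # Row for sample k is [x(k-1), ..., x(k-M)], k = l+M, ..., l+ls-1.
    rows = [xn[k - m_order:k][::-1] for k in range(l + m_order, l + ls)]
    return np.array(rows), xn[l + m_order:l + ls]

def estimate_ns_ar(training, ls, m_order):
    """Estimate a_l (8.27) and sigma_l^2 (8.30) for l = 0, ..., L - ls.

    `training` is N x L with one demeaned training pixel per row.
    """
    n, num_bands = training.shape
    a_hat = np.empty((num_bands - ls + 1, m_order))
    var_hat = np.empty(num_bands - ls + 1)
    for l in range(num_bands - ls + 1):
        blocks = [build_regression(xn, l, ls, m_order) for xn in training]
        x_mat = np.vstack([b[0] for b in blocks])     # X_l in (8.29)
        x2 = np.concatenate([b[1] for b in blocks])   # stacked x_l^(2)
        # Minimise ||x2 + X_l a||^2, i.e. a = -(X_l^T X_l)^{-1} X_l^T x2.
        a_hat[l] = -np.linalg.lstsq(x_mat, x2, rcond=None)[0]
        resid = x2 + x_mat @ a_hat[l]
        var_hat[l] = resid @ resid / (n * (ls - m_order))
    return a_hat, var_hat
```

When condition (8.28) fails, `lstsq` still returns a minimum-norm solution, but the residual then collapses toward zero and the estimates are not meaningful.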

8.4.4  NS-AR  Model  Order  Selection 

In this section, we develop information criterion based model order selection techniques to determine the NS-AR model order $M$ in (8.11). Although in principle it is possible to select a different $M$ for each subvector $\mathbf{x}_{n,l}$, $l = 0, \ldots, L - L_s$, by a separate fitting of $M$ to the information criterion, this is a tedious process. In the following, we use a fixed $M$ for all $l$.

Model order selection for parametric models is a classical topic and has been investigated by various researchers for various models (e.g., [30,31] and references therein). We examine herein the model order selection problem for the NS-AR model in (8.11) for the HSI application, which appears not to have been addressed elsewhere. Specifically, we consider a generalized Akaike information criterion (GAIC), which chooses the model order $M$ that minimizes

$$
W(M) = \sum_{l=0}^{L-L_s} \left\{ V_l(M) + \gamma(M) \right\},
\tag{8.32}
$$

where $V_l(M)$ is the minimum cost associated with the $l$th set of subvectors $\mathbf{x}_{1,l}, \ldots, \mathbf{x}_{N,l}$ and $\gamma(M)$ is a penalty term that penalizes increasing model order [31]. Specifically, the minimum cost is derived in Section 8.4.3 (cf. (8.25)):

$$
V_l(M) = C_1(M) + C_2(M) + \frac{1}{2} N (L_s - M) \ln \hat{\sigma}_l^2(M),
\tag{8.33}
$$

where $C_1(M)$, $C_2(M)$, and $\hat{\sigma}_l^2(M)$ are given by (8.23), (8.26), and (8.30), respectively, and the dependence on $M$ is made explicit. On the other hand, the penalty term typically takes the form [31]

$$
\gamma(M) = \alpha (M + 1) \ln(N L_s),
\tag{8.34}
$$

or

$$
\gamma(M) = \alpha (M + 1) \ln\left[ \ln(N L_s) \right],
\tag{8.35}
$$

where $M + 1$ is the total number of unknowns for each set of subvectors $\{\mathbf{x}_{n,l}\}_{n=1}^{N}$, $N L_s$ is the number of data samples contained in $\{\mathbf{x}_{n,l}\}_{n=1}^{N}$, and $\alpha > 2$ is a parameter of user choice. Note that the above GAIC reduces to the standard AIC [75] when the $(L - L_s + 1)$-term summation in (8.32) vanishes and $\gamma(M) = 2(M + 1)$. It is known that AIC is not a consistent model order estimator [30]. Choosing a penalty term proportional to $\ln(N L_s)$ or $\ln[\ln(N L_s)]$ is an effective way of obtaining consistent order estimates [31].
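The GAIC of (8.32)-(8.34) can be evaluated over a set of candidate orders as sketched below. The sketch recomputes the residual variance by least squares for each window, uses the logarithmic penalty (8.34), and treats the weight `alpha` and all function names as assumptions made for illustration.

```python
import numpy as np

def residual_variance(training, l, ls, m_order):
    """Minimum residual variance (8.30) for window position l, via least squares."""
    rows, targets = [], []
    for xn in training:
        for k in range(l + m_order, l + ls):
            rows.append(xn[k - m_order:k][::-1])   # x(k-1), ..., x(k-M)
            targets.append(xn[k])
    x_mat, x2 = np.array(rows), np.array(targets)
    a_hat = -np.linalg.lstsq(x_mat, x2, rcond=None)[0]
    resid = x2 + x_mat @ a_hat
    return resid @ resid / (training.shape[0] * (ls - m_order))

def gaic_penalty(m_order, n, ls, alpha=2.0):
    """Penalty term (8.34)."""
    return alpha * (m_order + 1) * np.log(n * ls)

def gaic(training, ls, m_order, alpha=2.0):
    """W(M) of (8.32), summed over all window positions."""
    n, num_bands = training.shape
    dof = n * (ls - m_order)
    c1 = 0.5 * dof * np.log(2.0 * np.pi)           # (8.23)
    c2 = 0.5 * dof                                 # (8.26)
    total = 0.0
    for l in range(num_bands - ls + 1):
        var = residual_variance(training, l, ls, m_order)
        total += c1 + c2 + 0.5 * dof * np.log(var) # V_l(M), (8.33)
        total += gaic_penalty(m_order, n, ls, alpha)
    return total

def select_order(training, ls, candidates, alpha=2.0):
    """Return the candidate order minimising W(M) together with all scores."""
    scores = {m: gaic(training, ls, m, alpha) for m in candidates}
    return min(scores, key=scores.get), scores
```

Candidate orders that violate (8.28) should be excluded by the caller, since the residual variance then approaches zero and its logarithm dominates the criterion.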

8.4.5  Training  Screening 

One assumption made in Section 8.2 is that the $N$ training pixels $\mathbf{x}_1, \ldots, \mathbf{x}_N$ are target-free. This assumption is reasonable in homogeneous environments where targets are rare or sparsely populated, but is usually violated in heterogeneous or dense-target environments. In the latter case, the performance of all training-based detectors, including the covariance-matrix based detectors discussed in Section 8.3, degrades considerably. Training screening to eliminate "bad" training data in such cases has been examined in a number of recent studies for radar target detection (e.g., [76]-[78] and references therein). In this section, we discuss screening of heterogeneous HSI training data. Rather than treating it as an independent process, we cast training screening within the proposed NS-AR framework.

For covariance-matrix based detectors in Section 8.3, one screening approach according to statistical ranking and selection theory is to compute the following metric from the training set [76]:

$$
T_n = \mathbf{x}_n^T \hat{\mathbf{R}}_b^{-1} \mathbf{x}_n, \qquad n = 1, \ldots, N.
\tag{8.36}
$$

Then, the metric is used to partition the training set $\mathcal{S} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$ into two disjoint sets $\mathcal{S}_1$ and $\mathcal{S}_2$ (see [76] for details), of which the former contains the refined training data while the latter contains outliers that are discarded.

The above training screening approach relies on an estimate of a full-rank sample covariance matrix $\hat{\mathbf{R}}_b$. To circumvent this, we note that $\mathbf{x}_n^T \hat{\mathbf{R}}_b^{-1} \mathbf{x}_n = \|\tilde{\mathbf{x}}_n\|^2$, where $\tilde{\mathbf{x}}_n = \hat{\mathbf{R}}_b^{-1/2} \mathbf{x}_n$, i.e., the "whitened" version of $\mathbf{x}_n$. The whitening operation can be equivalently implemented in a time-series fashion by an MA whitening filter without the need to estimate $\mathbf{R}_b$. This alternative screening approach is proposed in [78] and referred to as the innovation power sorting (IPS) method, since the output of the MA whitening filter is often called the innovation of the input (e.g., [79]).

The IPS can be extended and cast within the NS-AR framework. Specifically, we first use the ML estimator in Section 8.4.3 to estimate the NS-AR parameters $\{a_l(m), \sigma_l^2\}$ from the original training set $\mathcal{S}$. Next, we form a shift-varying MA whitening filter from these parameter estimates and, similarly to (8.13), whiten the training set as follows:

$$
\tilde{x}_n(l) = \frac{1}{\hat{\sigma}_{l-L_s}} \left[ x_n(l) + \sum_{m=1}^{M} \hat{a}_{l-L_s}(m)\, x_n(l - m) \right], \qquad l = L_s, \ldots, L - 1;\; n = 1, \ldots, N.
\tag{8.37}
$$

Finally, we compute the following metric

$$
T_n = \sum_{l=L_s}^{L-1} |\tilde{x}_n(l)|^2, \qquad n = 1, \ldots, N,
\tag{8.38}
$$

which is used to replace (8.36) for the partition of $\mathcal{S}$ into $\mathcal{S}_1$ and $\mathcal{S}_2$.
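A sketch of the NS-AR based innovation power sorting is given below. It uses the same index convention as the whitening filter (8.13), with outputs for bands $L_s$ through $L - 1$. The partition rule shown, keeping a fixed number of pixels with the smallest innovation power, is a simple stand-in assumed here; the actual rule is described in [76]. Function names are hypothetical.

```python
import numpy as np

def innovation_power(training, a_hat, var_hat, ls):
    """Innovation power of each training pixel after NS-AR whitening.

    `training` is N x L; a_hat[i] and var_hat[i] are the AR coefficients and
    residual variance estimated for window position i.
    """
    n, num_bands = training.shape
    m_order = a_hat.shape[1]
    power = np.zeros(n)
    for l in range(ls, num_bands):
        i = l - ls
        past = training[:, l - m_order:l][:, ::-1]   # x_n(l-1), ..., x_n(l-M)
        innov = (training[:, l] + past @ a_hat[i]) / np.sqrt(var_hat[i])
        power += innov ** 2
    return power

def screen(training, power, n_keep):
    """Split the training set: the n_keep lowest-power pixels and the rest."""
    order = np.argsort(power)
    return training[order[:n_keep]], training[order[n_keep:]]
```

After screening, the NS-AR parameters would typically be re-estimated from the retained pixels before whitening the test pixel.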


8.5  Experimental  Results 

In this section, we present experimental results to illustrate the performance of our proposed techniques. For comparison, we consider the covariance-matrix based ACE test (8.7), the AR model based NPAMF detector (see Section 8.4.1), our NS-AR model based NS-NPAMF detector (8.15), and a modified version called NS-LP-NPAMF that is briefly explained below. We do not compare with the AMF (8.5) or Kelly (8.6) tests, which were found to perform similarly to the ACE test in our experiments. Meanwhile, the ACE, NPAMF, NS-NPAMF, and NS-LP-NPAMF are all normalized tests whose test statistics range between 0 and 1, which makes comparison more convenient.

The modification made in the NS-LP-NPAMF detector is due to an observation that HSI spectral data exhibit small oscillations. As an example, Figure 8.2 depicts the original HSI data of a randomly chosen pixel from the HSI data set described below. Such oscillations along the spectral dimension do not contribute much to detection, while making the parameter estimates noisier. It was found that passing the HSI data through a lowpass (LP) filter to first remove those oscillations before applying the proposed NS-AR modeling, estimation, and detection techniques is helpful. Our NS-NPAMF detector (8.15) with such a modification is called NS-LP-NPAMF. For lowpass filtering, we use a simple moving-average filter with impulse response given by a Kaiser window, whose length is equal to the sliding window size $L_s$ and whose shape parameter is 3. It should be noted that LP filtering is applied to all signals involved in detection, including the training pixels, test pixel, and target signature.
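The lowpass prefiltering step can be sketched with NumPy's Kaiser window. The unit-DC-gain normalization and the `"same"` edge handling are assumptions made for this sketch; the text does not specify either, and the function name is hypothetical.

```python
import numpy as np

def kaiser_lowpass(v, ls, beta=3.0):
    """Smooth a spectrum with a Kaiser-window moving-average filter.

    The window length equals the sliding window size ls and the shape
    parameter defaults to 3, following the description in the text.
    """
    h = np.kaiser(ls, beta)
    h = h / h.sum()                        # unit DC gain (assumed normalization)
    return np.convolve(np.asarray(v, dtype=float), h, mode="same")
```

The same filter would be applied to every training pixel, the test pixel, and the target signature, so that all signals entering the detector are processed identically.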

The HSI data employed in our studies are provided on the CD that accompanies [60]. Figure
8.3 is a color infrared (IR) image of a portion of the data set, which shows an airborne
hyperspectral data flightline over the Washington DC area. The sensor system used in this case
measured the spectral response in 210 spectral bands in the 0.4 to 2.4 µm region of the visible
and IR spectrum. Bands in the 0.9 and 1.4 µm regions, where the atmosphere is opaque, have been
omitted from the data, leaving $L = 191$ spectral bands. Additional information on the data set
can be found in [60]. The image shown in Figure 8.3 was made using bands 60, 27 and 17 for
the red, green and blue colors, respectively. Three test regions are highlighted in Figure 8.3. Test
region #1 is relatively homogeneous and consists of grass, test region #2 is less homogeneous
and contains trees and road, and test region #3 corresponds to a heterogeneous environment. To simulate
the $H_1$ condition, we superimpose a target signal on the test pixel. The target signal corresponds
to the spectral signature of a man-made object (taken from a pixel in Figure 8.3) and is scaled
according to particular target fill factors [4]. Each test data set is first demeaned using a 3 × 3
spatial moving average filter (see [80] for details on the demeaning process).
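
A rough sketch of the 3 × 3 spatial demeaning step is given below. The exact procedure is specified in [80], which is not reproduced here, so the details are assumptions: the helper name `demean_3x3` is hypothetical, the local mean includes the center pixel, and image borders are handled by edge replication.

```python
import numpy as np

def demean_3x3(cube):
    """Subtract a 3x3 spatial moving average from every band.

    cube: array of shape (rows, cols, bands).
    """
    # assumption: replicate border pixels so the output keeps the input size
    padded = np.pad(cube, ((1, 1), (1, 1), (0, 0)), mode="edge")
    rows, cols, _ = cube.shape
    local_mean = np.zeros_like(cube, dtype=float)
    for dr in range(3):
        for dc in range(3):
            local_mean += padded[dr:dr + rows, dc:dc + cols, :]
    local_mean /= 9.0
    return cube - local_mean

# Small synthetic cube standing in for a test region
rng = np.random.default_rng(0)
region = rng.standard_normal((12, 15, 191)) + 5.0
region_demeaned = demean_3x3(region)
```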

8.5.1  Model  Order  Selection 

We first use the GAIC developed in Section 8.4.4 to determine the model order $M$ of the NS-AR
model. Figure 8.4 depicts $W(M)$ in (8.32) as a function of $M$ for $L_s = 10$ and $N = 8$; the
result is obtained by averaging over the pixels in test region #1. Results obtained with the other
two test regions are similar. It is seen that $W(M)$ decreases quickly as $M$ increases from 1 to
3, reaches its minimum and remains relatively flat between 3 and 5, then increases slightly from
5 to 8, and finally drops drastically for $M = 9$. The pattern of a decrease followed by an increase of
$W(M)$ is standard for most model selection techniques [30]. To understand why $W(M)$ drops
again at $M = 9$, we note that (8.28) is violated with $M = 9$. As a result, $X_t$ in (8.29) does not
have full column rank, and there are numerous solutions for the NS-AR coefficients $\{a_i(m)\}$ that
lead to zero residual in the NS-AR model. In the following, we choose $M = 5$.
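
The zero-residual effect can be reproduced with a generic least-squares problem. The sketch below does not implement (8.28) or (8.29), which lie outside this section; it only assumes that violating (8.28) means the regression has fewer equations than unknowns. The matrix sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def ls_residual(n_equations, n_unknowns):
    """Norm of the least-squares residual for a random regression problem."""
    X = rng.standard_normal((n_equations, n_unknowns))
    y = rng.standard_normal(n_equations)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.linalg.norm(y - X @ coef)

overdetermined = ls_residual(16, 8)    # more equations than unknowns
underdetermined = ls_residual(8, 9)    # X cannot have full column rank
```

In the underdetermined case the data are fitted exactly, so any fit-based criterion drops sharply regardless of whether the larger model is meaningful.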

8.5.2  Detection  in  Homogeneous  Environments 

To illustrate detection performance in homogeneous environments, the figure of merit employed
here is the separation of test statistics under $H_0$ and $H_1$, which is also used in [4]. For all
methods, we use $N = 8$ training pixels, which corresponds to a 3 × 3 region without counting
the center pixel (i.e., the test pixel), for sample covariance matrix or parameter estimation. The
sample covariance matrix $\hat{R}$ is rank deficient in this case. As suggested in [4], we use the
approximation $\hat{R}^{-1} \approx I - U_1 U_1^H$, where $U_1$ is formed by the principal eigenvectors of $\hat{R}$, for
the ACE detector. The subvector length (i.e., sliding window length) is $L_s = 10$ for NS-NPAMF
and NS-LP-NPAMF.
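
A minimal sketch of this projection-type approximation follows, reading it as $\hat{R}^{-1} \approx I - U_1 U_1^H$. The helper name `inv_cov_projection` and the parameter `n_principal` are hypothetical; the text does not state how many principal eigenvectors are retained, so that number is left to the caller.

```python
import numpy as np

def inv_cov_projection(train, n_principal):
    """Approximate the inverse sample covariance by I - U1 U1^H.

    train: (N, L) array of N demeaned training spectra with L bands.
    """
    R = train.conj().T @ train / train.shape[0]   # sample covariance, rank <= N
    _, eigvec = np.linalg.eigh(R)                 # eigenvalues in ascending order
    U1 = eigvec[:, -n_principal:]                 # principal eigenvectors
    return np.eye(R.shape[0]) - U1 @ U1.conj().T

rng = np.random.default_rng(0)
train = rng.standard_normal((8, 191))             # N = 8 pixels, L = 191 bands
P_approx = inv_cov_projection(train, n_principal=8)
```

The result is an orthogonal projector onto the complement of the dominant background subspace, so background components spanned by the training pixels are suppressed.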

First consider test region #1. Figures 8.5(a) to 8.5(d) depict the test statistic separation of the
four detectors (ACE, NPAMF, NS-NPAMF, and NS-LP-NPAMF, respectively) as a function of the
target fill factor. We note that NPAMF is the worst of all detectors, which corroborates our
earlier observation that stationary AR modeling is not suitable for HSI data. However, both
NS-NPAMF and NS-LP-NPAMF outperform the ACE test, with NS-LP-NPAMF being slightly better
than NS-NPAMF. Specifically, NS-LP-NPAMF achieves full target-background separation when the
fill factor is 0.25, while NS-NPAMF does not.

Figures 8.6(a) to 8.6(d) depict the counterpart results when the detectors are applied to test
region #2, which is less homogeneous than test region #1. It is seen that all four detectors
experience some degradation relative to the previous results. However, the proposed NS-NPAMF and
NS-LP-NPAMF detectors, especially the latter, still significantly outperform the others.

8.5.3  Detection  in  Heterogeneous  Environments 

We now consider detection in heterogeneous environments. To this end, we embed 5 targets
at randomly chosen locations in test region #3. We run the ACE and NS-LP-NPAMF detectors
throughout the test region pixel by pixel, with and without training screening. If training
screening is not applied, we use the $N = 8$ pixels surrounding the test pixel for training.
Otherwise, we first compute metric (8.36) for the ACE detector and metric (8.38) for
the NS-LP-NPAMF detector using all pixels within the test region, and then use the metrics
to select $N = 8$ new training pixels to refine the parameter/covariance matrix estimate. Figures
8.7(a) to 8.7(d) depict the test statistics of the two detectors, with and without training
screening, versus the index of the pixels within the test region. The dotted lines in these plots indicate
the indices/locations of the embedded targets. Comparing the results, it is seen that training
screening helps both detectors. It is also seen that the proposed NS-LP-NPAMF detector
outperforms the ACE detector with or without training screening.

Finally, we consider a dense-target scenario by embedding not only 5 targets but also outliers
in test region #1. In particular, about 20% of the pixels at random locations in the region are
embedded with outliers that have a different spectral signature from that of the target. Figures 8.8(a)
to 8.8(d) show the test statistics of the ACE and NS-LP-NPAMF detectors with and without
training screening. It is seen that the NS-LP-NPAMF detector overall achieves better performance
than the ACE detector.


8.6  Conclusions 

In this chapter, we have exploited parametric modeling of HSI data and investigated its
application to subpixel target detection in HSI systems. We have shown that HSI data are non-stationary
in the spectral dimension, which makes parametric adaptive modeling and detection more
challenging than in earlier studies for stationary data. To deal with non-stationarity, we have proposed a
sliding-window based NS-AR model tailored for HSI data. We have developed parametric adaptive
detectors by exploiting the NS-AR model, and addressed a range of issues including model
order selection, training screening, parameter estimation, time-series based signal whitening, and
detection. We have examined the performance of the proposed detectors and compared them with
covariance-matrix based techniques using real HSI data. It has been shown that the proposed
parametric detectors are more efficient in training data usage and outperform the
covariance-matrix based methods when training is limited.

Figure 8.1: Sample estimates of the autocorrelation function (ACF) at spectral lag 0 (variance),
lag 1, lag 2 and lag 3 across the spectral bands (horizontal axis: spectral bands).

Our  approach  implicitly  assumes  that  HSI  data  is  spectrally  correlated.  In  most  cases,  HSI 
sensors  oversample  the  spectral  signal  [3],  which  brings  in  spectral  correlation  in  HSI  data.  The 
covariance-matrix  based  detectors,  however,  can  be  applied  in  the  absence  of  spectral  correlation 
(as  in  earlier  multispectral  systems  with  a  few  spectral  bands  [65]).  While  the  covariance-matrix 
based  AMF,  ACE  and  GLRT  detectors  have  a  CFAR  behavior,  it  is  unclear  whether  the  proposed 
detectors  retain  the  same  property.  This  remains  an  issue  to  be  examined  in  the  future.  Other 
research  along  the  proposed  direction  includes  analytical  study  of  the  proposed  detectors  and 
exploration of alternative non-stationary parametric models for HSI target detection.

Figure 8.2: Original (solid line) and smoothed (dotted line) HSI data.


Figure 8.3: HSI image of the Washington DC Mall with L = 191 spectral bands. Three test regions
are highlighted in yellow.

Figure 8.4: NS-AR model order selection.

Figure 8.5: Test region #1: target-background separation versus target fill factor, where the red
(dark) bars correspond to the range of test statistics under $H_1$, while the green (light) bars show
the counterpart under $H_0$. (a) ACE. (b) NPAMF. (c) NS-NPAMF. (d) NS-LP-NPAMF.

Figure 8.6: Test region #2: target-background separation versus target fill factor, where the red
(dark) bars correspond to the range of test statistics under $H_1$, while the green (light) bars show
the counterpart under $H_0$. (a) ACE. (b) NPAMF. (c) NS-NPAMF. (d) NS-LP-NPAMF.

Figure 8.7: Test statistics of ACE and NS-LP-NPAMF of test pixels in test region #3 with
5 embedded targets. (a) ACE without training screening. (b) NS-LP-NPAMF without training
screening. (c) ACE with training screening. (d) NS-LP-NPAMF with training screening.

Figure 8.8: Test statistics of ACE and NS-LP-NPAMF of test pixels in test region #1 with
5 embedded targets and more than 20% of the pixels embedded with outliers. (a) ACE
without training screening. (b) NS-LP-NPAMF without training screening. (c) ACE with training
screening. (d) NS-LP-NPAMF with training screening.

Appendix A


Parametric Rao Test


A.1 ML Parameter Estimation

In the following, we derive the ML estimates of the nuisance parameters $Q$ and $\{A(p)\}_{p=1}^{P}$, or $A$
defined in (3.15), under $H_0$, which will be needed in the derivation of the Rao test in Appendix
A.2. The joint PDF or likelihood function $\prod_k f_i(x_k(0), x_k(1), \ldots, x_k(N-1); \alpha, A, Q)$ under
$H_i$, $i = 0$ or 1, can be written as

$$\prod_k \Big[ f_i\big(x_k(P), x_k(P+1), \ldots, x_k(N-1) \mid x_k(0), x_k(1), \ldots, x_k(P-1); \alpha, A, Q\big) \times f_i\big(x_k(0), x_k(1), \ldots, x_k(P-1); \alpha, A, Q\big) \Big]. \tag{A.1}$$

The exact maximization of the PDF with respect to the unknown parameters produces a set of
highly nonlinear equations that are difficult to solve. For large data records, the likelihood
function can be approximated well by the conditional PDF in the above equation [35] and, therefore,
the latter can be used for parameter estimation. After some manipulations using the standard
procedure for obtaining the PDF of a set of transformed random variables, we have

$$f_i\big(x_k(P), x_k(P+1), \ldots, x_k(N-1) \mid x_k(0), x_k(1), \ldots, x_k(P-1); \alpha, A, Q\big) = \prod_{n=P}^{N-1} \frac{1}{\pi^J |Q|} \exp\{-\varepsilon_k^H(n) Q^{-1} \varepsilon_k(n)\}, \tag{A.2}$$

where for $k \geq 1$, $\varepsilon_k(n)$ is a function of the observed signals given by (7.3), whereas $\varepsilon_0(n)$ is
given by (3.8) or (3.11) with $\alpha = 0$ when $i = 0$ and $\alpha \neq 0$ when $i = 1$.

Recall that the training signals $\{x_k\}_{k=1}^{K}$ and the test signal $x_0$ are independent. Let
$X(n) = [x_0^T(n), x_1^T(n), x_2^T(n), \ldots, x_K^T(n)]^T$. The joint conditional PDF is given by

$$f_i\big(X(P), X(P+1), \ldots, X(N-1) \mid X(0), X(1), \ldots, X(P-1); \alpha, A, Q\big) = f_i(\alpha, A, Q) = \left[ \frac{1}{\pi^J |Q|} \exp\{-\mathrm{tr}(Q^{-1} Q_i(\alpha, A))\} \right]^{(K+1)(N-P)}, \tag{A.3}$$


where in the first equality we dropped the dependence on the observed signals for notational
brevity,

$$Q_i(\alpha, A) = \frac{1}{(K+1)(N-P)} \sum_{n=P}^{N-1} \sum_{k=0}^{K} \varepsilon_k(n) \varepsilon_k^H(n), \tag{A.4}$$

and we reiterate that $\alpha = 0$ for $i = 0$, $\alpha \neq 0$ for $i = 1$, and $\varepsilon_0(n)$ depends on $\alpha$ as shown in (3.8)
or (3.11).

The Rao test requires the ML estimates of the nuisance parameters under $H_0$. Henceforth, we
only consider the case $i = 0$. Taking the derivative of $\ln f_0(A, Q)$ with respect to $Q$ and equating
the result to zero produces the ML estimate of $Q$ given $A$:¹

$$\hat{Q}_{\mathrm{ML}}(A) = Q_0(A). \tag{A.5}$$

Substituting $\hat{Q}_{\mathrm{ML}}(A)$ into $f_0(A, Q)$, we have

$$\max_{Q} f_0(A, Q) = \left[ \frac{1}{(e\pi)^J |Q_0(A)|} \right]^{(K+1)(N-P)}. \tag{A.6}$$


Next, we determine the ML estimate of $A$. Since maximizing (A.6) is equivalent to
minimizing $|\hat{Q}_{\mathrm{ML}}(A)|$, or $|Q_0(A)|$, the ML estimate of $A$ can be obtained by minimizing $|Q_0(A)|$
with respect to $A$. We next expand the matrix as follows:

$$(K+1)(N-P)\, Q_0(A) = R_{xx} + A^H R_{yx} + R_{yx}^H A + A^H R_{yy} A = \big(A^H + R_{yx}^H R_{yy}^{-1}\big) R_{yy} \big(A^H + R_{yx}^H R_{yy}^{-1}\big)^H + R_{xx} - R_{yx}^H R_{yy}^{-1} R_{yx}, \tag{A.7}$$

where the correlation matrices are defined in (3.17)-(3.19). Since $R_{yy}$ is non-negative definite
and the remaining terms in (A.7) do not depend on $A$, it follows that²

$$Q_0(A) \geq Q_0(A)\big|_{A = \hat{A}}, \tag{A.8}$$

where

$$\hat{A}^H = -R_{yx}^H R_{yy}^{-1}. \tag{A.9}$$

¹Since $\alpha = 0$ for $i = 0$, the dependence on $\alpha$ is dropped.

²For two non-negative definite matrices $A$ and $B$, we have $A \geq B$ if $A - B$ is non-negative definite.

When $Q_0(A)$ is minimized, the estimate $\hat{A}^H$ of $A^H$ will minimize any non-decreasing function
including the determinant of $Q_0(A)$ [41]. Hence, the ML estimate $\hat{A}^H_{\mathrm{ML}}$ of $A^H$ is given by (A.9)
or (3.20), and $\hat{Q}_{\mathrm{ML}}$ is given by (3.21), which is obtained by replacing $A^H$ in (A.7) with $\hat{A}^H_{\mathrm{ML}}$.
The subscript "ML" is dropped in other parts of the paper for notational brevity.
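
The closed-form minimizer can be checked numerically. The sketch below is an illustration under stated assumptions: the definitions (3.17)-(3.19) are not reproduced in this appendix, so the correlation matrices are taken here as $R_{yy} = \sum_n y(n) y^H(n)$ and $R_{yx} = \sum_n y(n) x^H(n)$, with the residual written as $\varepsilon(n) = x(n) + A^H y(n)$; the data are random and all sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
J, JP, n = 2, 4, 200                     # hypothetical channel/regressor sizes

def crandn(*shape):
    """Complex standard normal samples."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

X = crandn(J, n)                         # columns play the role of x(n)
Y = crandn(JP, n)                        # columns play the role of stacked past samples y(n)

R_yy = Y @ Y.conj().T                    # assumed form of R_yy
R_yx = Y @ X.conj().T                    # assumed form of R_yx

def det_Q(AH):
    """Determinant of the residual covariance for a given A^H."""
    E = X + AH @ Y
    return np.linalg.det(E @ E.conj().T / n).real

AH_hat = -R_yx.conj().T @ np.linalg.inv(R_yy)   # closed form, cf. (A.9)
d_hat = det_Q(AH_hat)
d_perturbed = [det_Q(AH_hat + 0.1 * crandn(J, JP)) for _ in range(20)]
```

Every perturbed coefficient matrix yields a determinant at least as large as `d_hat`, in line with the matrix inequality (A.8).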


A.2  Derivation  of  the  Parametric  Rao  Test 


The composite hypothesis testing problem (3.3) involves a signal parameter vector
$\theta_r = [\alpha_R, \alpha_I]^T = [\Re\{\alpha\}, \Im\{\alpha\}]^T$ and a nuisance parameter vector $\theta_s$ that includes all unknown
parameters in $\{A^H(p)\}_{p=1}^{P}$ and $Q$. The nuisance parameter vector may be written as
$\theta_s = [q_R^T, q_I^T, a_R^T, a_I^T]^T$ with $a_R = \mathrm{vec}(\Re\{A^H\})$ and $a_I = \mathrm{vec}(\Im\{A^H\})$; $q_R$ contains the
diagonal elements of $Q$ and the real part of the elements below the diagonal, while $q_I$ contains
the imaginary part of the elements below the diagonal (note that the spatial covariance matrix $Q$
is a Hermitian matrix). Let

$$\theta = [\theta_r^T, \theta_s^T]^T. \tag{A.10}$$

Observing that the nuisance parameters are the same under both hypotheses, we can write the
parameter test as follows:

$$H_0: \ \theta_r = \theta_{r_0}, \ \theta_s \qquad\qquad H_1: \ \theta_r = \theta_{r_1}, \ \theta_s \tag{A.11}$$


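where $\theta_{r_0} = [0, 0]^T$ and $\theta_{r_1} = \theta_r = [\alpha_R, \alpha_I]^T$. The PDF under $H_0$ and the PDF under $H_1$
differ only in the value of $\theta_r$, and they are given by (see Appendix A.1):

$$f(\theta) = \left[ \frac{1}{\pi^J |Q|} \exp\{-\mathrm{tr}(Q^{-1} Q_i(\alpha, A))\} \right]^{(K+1)(N-P)},$$

where $Q_i(\alpha, A)$ is defined in (A.4). The Rao test is given by [7]

$$\left. \frac{\partial \ln f(\theta)}{\partial \theta_r} \right|_{\theta = \tilde{\theta}}^{T} \big[ J^{-1}(\tilde{\theta}) \big]_{\theta_r, \theta_r} \left. \frac{\partial \ln f(\theta)}{\partial \theta_r} \right|_{\theta = \tilde{\theta}} \ \underset{H_0}{\overset{H_1}{\gtrless}} \ \gamma_{\mathrm{Rao}}, \tag{A.12}$$

where $\gamma_{\mathrm{Rao}}$ denotes a corresponding threshold,

$$\tilde{\theta} = \big[ \theta_{r_0}^T, \hat{\theta}_{s_0}^T \big]^T \tag{A.13}$$

denotes the ML estimate of $\theta$ under $H_0$, and

$$\big[ J^{-1}(\theta) \big]_{\theta_r, \theta_r} = \big( J_{\theta_r, \theta_r}(\theta) - J_{\theta_r, \theta_s}(\theta) J_{\theta_s, \theta_s}^{-1}(\theta) J_{\theta_s, \theta_r}(\theta) \big)^{-1}, \tag{A.14}$$

which is related to the Fisher information matrix (FIM), given by [7]

$$J(\theta) = \begin{bmatrix} J_{\theta_r, \theta_r}(\theta) & J_{\theta_r, \theta_s}(\theta) \\ J_{\theta_s, \theta_r}(\theta) & J_{\theta_s, \theta_s}(\theta) \end{bmatrix}. \tag{A.15}$$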


Hence,  the  problem  boils  down  to  finding  the  ML  estimates  of  the  nuisance  parameters  under  H0, 
which  have  been  obtained  in  Appendix  A.l,  and  evaluating  the  first  order  derivatives  of  the  log 
likelihood  and  the  FIM  at  the  ML  estimates  of  the  nuisance  parameters.  The  latter  task  is  worked 
out  next. 

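The FIM is block diagonal. To see this, let $q_{R_i}$, $q_{I_i}$, $a_{R_i}$, and $a_{I_i}$ denote the $i$-th element of
$q_R$, $q_I$, $a_R$, and $a_I$, respectively. The first partial derivative of the log likelihood $\ln f$ with respect
to (w.r.t.) $\alpha_R$ is

$$\frac{\partial \ln f}{\partial \alpha_R} = \sum_{n=P}^{N-1} \tilde{s}^H(n) Q^{-1} \varepsilon_0(n) + \sum_{n=P}^{N-1} \varepsilon_0^H(n) Q^{-1} \tilde{s}(n). \tag{A.16}$$

The second partial derivative of $\ln f$ w.r.t. $\alpha_R$ and $q_{R_i}$ becomes

$$\frac{\partial^2 \ln f}{\partial \alpha_R \partial q_{R_i}} = -2 \Re \left\{ \sum_{n=P}^{N-1} \varepsilon_0^H(n) Q^{-1} \frac{\partial Q}{\partial q_{R_i}} Q^{-1} \tilde{s}(n) \right\}. \tag{A.17}$$

Likewise, we have

$$\frac{\partial^2 \ln f}{\partial \alpha_R \partial q_{I_i}} = -2 \Re \left\{ \sum_{n=P}^{N-1} \varepsilon_0^H(n) Q^{-1} \frac{\partial Q}{\partial q_{I_i}} Q^{-1} \tilde{s}(n) \right\} \tag{A.18}$$

and

$$\frac{\partial^2 \ln f}{\partial \alpha_R \partial a_{R_i}} = 2 \sum_{n=P}^{N-1} \sum_{p=1}^{P} \Re \left\{ s^H(n-p) \frac{\partial A(p)}{\partial a_{R_i}} Q^{-1} \varepsilon_0(n) + \tilde{s}^H(n) Q^{-1} \frac{\partial A^H(p)}{\partial a_{R_i}} \big[ x_0(n-p) - \alpha s(n-p) \big] \right\}. \tag{A.19}$$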


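Since $E[x_0(n) - \alpha s(n)] = E[x_0(n-p) - \alpha s(n-p)] = 0$ and $E[\varepsilon_0(n)] = 0$, taking the expectation
in (A.17)-(A.19) yields

$$E\left[ \frac{\partial^2 \ln f}{\partial \alpha_R \partial q_{R_i}} \right] = E\left[ \frac{\partial^2 \ln f}{\partial \alpha_R \partial q_{I_i}} \right] = E\left[ \frac{\partial^2 \ln f}{\partial \alpha_R \partial a_{R_i}} \right] = 0. \tag{A.20}$$

In a similar way, we can show

$$E\left[ \frac{\partial^2 \ln f}{\partial \alpha_I \partial q_{R_i}} \right] = E\left[ \frac{\partial^2 \ln f}{\partial \alpha_I \partial q_{I_i}} \right] = E\left[ \frac{\partial^2 \ln f}{\partial \alpha_R \partial a_{I_i}} \right] = E\left[ \frac{\partial^2 \ln f}{\partial \alpha_I \partial a_{R_i}} \right] = E\left[ \frac{\partial^2 \ln f}{\partial \alpha_I \partial a_{I_i}} \right] = 0. \tag{A.21}$$

Summarizing the above calculations, we have

$$J_{\theta_r, \theta_s}(\theta) = 0, \qquad J_{\theta_s, \theta_r}(\theta) = 0, \tag{A.22}$$

which implies that the FIM is block diagonal. It follows that

$$\big[ J^{-1}(\theta) \big]_{\theta_r, \theta_r} = J_{\theta_r, \theta_r}^{-1}(\theta). \tag{A.23}$$

Hence, we only need to compute $J_{\theta_r, \theta_r}(\theta)$, which is obtained next.

The second partial derivative of $\ln f$ w.r.t. $\alpha_R$ is

$$\frac{\partial^2 \ln f}{\partial \alpha_R^2} = -2 \sum_{n=P}^{N-1} \tilde{s}^H(n) Q^{-1} \tilde{s}(n). \tag{A.24}$$

Likewise, we have

$$\frac{\partial^2 \ln f}{\partial \alpha_I^2} = -2 \sum_{n=P}^{N-1} \tilde{s}^H(n) Q^{-1} \tilde{s}(n) \tag{A.25}$$

and

$$\frac{\partial^2 \ln f}{\partial \alpha_R \partial \alpha_I} = \frac{\partial^2 \ln f}{\partial \alpha_I \partial \alpha_R} = 0. \tag{A.26}$$

As a result, we have the FIM associated with the signal parameter vector:

$$J_{\theta_r, \theta_r}(\theta) = 2 \sum_{n=P}^{N-1} \tilde{s}^H(n) Q^{-1} \tilde{s}(n) \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \tag{A.27}$$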

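Finally, by inverting the matrix (A.27) and replacing $\theta$ with $\tilde{\theta}$, which is the ML estimate of $\theta$
under $H_0$, we have

$$\big[ J^{-1}(\tilde{\theta}) \big]_{\theta_r, \theta_r} = \frac{1}{2 \sum_{n=P}^{N-1} \hat{\tilde{s}}^H(n) \hat{Q}^{-1} \hat{\tilde{s}}(n)} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \tag{A.28}$$

where $\hat{Q}$ is the ML estimate of the spatial covariance matrix in (3.21), and $\hat{\tilde{s}}(n)$ is the temporally
whitened steering vector in (3.13). Moreover, since $\varepsilon_0(n)|_{\theta = \tilde{\theta}} = \hat{\tilde{x}}_0(n)$, we have

$$\left. \frac{\partial \ln f}{\partial \alpha_R} \right|_{\theta = \tilde{\theta}} = \sum_{n=P}^{N-1} \left[ \hat{\tilde{x}}_0^H(n) \hat{Q}^{-1} \hat{\tilde{s}}(n) + \hat{\tilde{s}}^H(n) \hat{Q}^{-1} \hat{\tilde{x}}_0(n) \right], \tag{A.29}$$

$$\left. \frac{\partial \ln f}{\partial \alpha_I} \right|_{\theta = \tilde{\theta}} = j \sum_{n=P}^{N-1} \left[ \hat{\tilde{x}}_0^H(n) \hat{Q}^{-1} \hat{\tilde{s}}(n) - \hat{\tilde{s}}^H(n) \hat{Q}^{-1} \hat{\tilde{x}}_0(n) \right]. \tag{A.30}$$

Using (A.28)-(A.30) in (A.12) yields the parametric Rao test, which is shown in (7.14).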
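
As a worked step (a sketch only, since (7.14) is not reproduced in this appendix), the substitution can be carried out explicitly. The shorthand $g$ is introduced here for convenience, and the hatted-tilde notation for the whitened quantities evaluated at the ML estimates is an assumption. Let $g = \sum_{n=P}^{N-1} \hat{\tilde{s}}^H(n) \hat{Q}^{-1} \hat{\tilde{x}}_0(n)$. Reading (A.29) and (A.30) as $2\Re\{g\}$ and $2\Im\{g\}$, respectively, the quadratic form in (A.12) becomes

$$\begin{bmatrix} 2\Re\{g\} & 2\Im\{g\} \end{bmatrix} \frac{1}{2 \sum_{n=P}^{N-1} \hat{\tilde{s}}^H(n) \hat{Q}^{-1} \hat{\tilde{s}}(n)} \begin{bmatrix} 2\Re\{g\} \\ 2\Im\{g\} \end{bmatrix} = \frac{2 \left| \sum_{n=P}^{N-1} \hat{\tilde{s}}^H(n) \hat{Q}^{-1} \hat{\tilde{x}}_0(n) \right|^2}{\sum_{n=P}^{N-1} \hat{\tilde{s}}^H(n) \hat{Q}^{-1} \hat{\tilde{s}}(n)},$$

which is then compared with the threshold $\gamma_{\mathrm{Rao}}$.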
A.3 Asymptotic Distribution of the Parametric Rao Test Statistic


The Rao test is known to have the same asymptotic performance as the GLRT. Using the
asymptotic results for the GLRT [7], the asymptotic distribution of our parametric Rao test statistic is
given by

$$T_{\mathrm{Rao}} \sim \begin{cases} \chi_2^2, & \text{under } H_0, \\ \chi_2'^2(\lambda), & \text{under } H_1, \end{cases} \tag{A.31}$$

where $\chi_2^2$ denotes the central Chi-squared distribution with 2 degrees of freedom and $\chi_2'^2(\lambda)$ the
non-central Chi-squared distribution with 2 degrees of freedom and non-centrality parameter $\lambda$:

$$\lambda = (\theta_{r_1} - \theta_{r_0})^T \Big( \big[ J^{-1}([\theta_{r_0}, \theta_s]) \big]_{\theta_r, \theta_r} \Big)^{-1} (\theta_{r_1} - \theta_{r_0}). \tag{A.32}$$

Using the observations $\theta_{r_1} - \theta_{r_0} = [\alpha_R, \alpha_I]^T$ and (cf. (A.27))

$$\big[ J^{-1}([\theta_{r_0}, \theta_s]) \big]_{\theta_r, \theta_r} = \frac{1}{2 \sum_{n=P}^{N-1} \tilde{s}^H(n) Q^{-1} \tilde{s}(n)} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \tag{A.33}$$

we have the asymptotic distribution of the parametric Rao test statistic as shown in (3.22).
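
For reference, the substitution can be written out in one line. This is a sketch under the assumption that the whitened steering vector is denoted $\tilde{s}(n)$; the result should agree with the non-centrality parameter in (3.22), which lies outside this appendix:

$$\lambda = \begin{bmatrix} \alpha_R & \alpha_I \end{bmatrix} \left( 2 \sum_{n=P}^{N-1} \tilde{s}^H(n) Q^{-1} \tilde{s}(n) \right) \begin{bmatrix} \alpha_R \\ \alpha_I \end{bmatrix} = 2 |\alpha|^2 \sum_{n=P}^{N-1} \tilde{s}^H(n) Q^{-1} \tilde{s}(n).$$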


A.4  Performance  of  the  MF  and  AMF  Detectors 


The  performance  of  the  MF  and  AMF  detectors  can  be  computed  analytically.  In  this  appendix, 
we  include  a  brief  summary  of  their  performance  for  easy  reference. 

Consider the MF detector (8.3) first. Let $R^{1/2}$ be the square root of the space-time
covariance matrix $R$. Define $\tilde{s} = R^{-1/2} s$ and $\tilde{x}_0 = R^{-1/2} x_0$, which are the spatially and temporally
whitened steering vector and test signal, respectively. Since the rank of $\tilde{s}\tilde{s}^H$ is one, we have the
following eigendecomposition:

$$\tilde{s}\tilde{s}^H = U \Lambda U^H, \tag{A.34}$$

where $\Lambda = \mathrm{diag}(\tilde{s}^H \tilde{s}, 0, \ldots, 0)$ and $U^H U = I$. Let $\breve{x}_0 = U^H \tilde{x}_0$ and $\breve{s} = U^H \tilde{s}$, which are
rotated versions of $\tilde{x}_0$ and $\tilde{s}$, respectively. Then, the test statistic can be written as

$$T_{\mathrm{MF}} = \frac{|\tilde{s}^H \tilde{x}_0|^2}{\tilde{s}^H \tilde{s}} = \frac{\tilde{x}_0^H U \Lambda U^H \tilde{x}_0}{\tilde{s}^H \tilde{s}} = \breve{x}_{0,1}^* \breve{x}_{0,1} = |\breve{x}_{0,1}|^2, \tag{A.35}$$

where $\breve{x}_{0,1}$ and $\breve{s}_1$ denote the first elements of $\breve{x}_0$ and $\breve{s}$, respectively. It is clear from Assumptions
AS1 to AS3 in Section 7.2 that $\breve{x}_{0,1}$ is a complex Gaussian variable: $\breve{x}_{0,1} \sim \mathcal{CN}(\alpha \breve{s}_1, 1)$ with
$\alpha = 0$ under $H_0$ and $\alpha \neq 0$ under $H_1$. Hence, $2T_{\mathrm{MF}} = 2|\breve{x}_{0,1}|^2$ has a central Chi-squared
distribution with 2 degrees of freedom under $H_0$ and a non-central Chi-squared
distribution with 2 degrees of freedom and non-centrality parameter $\lambda_{\mathrm{MF}} = 2|\alpha \breve{s}_1|^2$ under $H_1$.
It is noted that the distribution of the MF test statistic is similar to that of the parametric Rao
test statistic; the only difference is the non-centrality parameter under $H_1$. Hence, the false
alarm and detection probabilities can be computed in the same way as in (3.27) and (3.28).
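
The $H_0$ distribution can be checked with a short Monte Carlo run. This is a sketch under assumptions: the covariance matrix, steering vector, and dimensions below are arbitrary stand-ins, and the statistic is taken as $T_{\mathrm{MF}} = |s^H R^{-1} x_0|^2 / (s^H R^{-1} s)$, which is how (A.35) reads after undoing the whitening. If $2T_{\mathrm{MF}}$ is central Chi-squared with 2 degrees of freedom, then $T_{\mathrm{MF}}$ is exponential with unit mean and the false alarm probability for a threshold $\gamma$ is $e^{-\gamma}$.

```python
import numpy as np

rng = np.random.default_rng(3)
M, trials = 6, 20000                     # hypothetical dimension and run count

def crandn(*shape):
    """Complex standard normal samples with unit-variance real and imaginary parts."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

B = crandn(M, M)
R = B @ B.conj().T + M * np.eye(M)       # arbitrary Hermitian positive definite covariance
C = np.linalg.cholesky(R)                # R = C C^H
s = crandn(M)                            # arbitrary steering vector

# Test signals under H0: x0 ~ CN(0, R), one per column
X0 = C @ (crandn(M, trials) / np.sqrt(2))

Rinv_s = np.linalg.solve(R, s)
T_mf = np.abs(Rinv_s.conj() @ X0) ** 2 / (s.conj() @ Rinv_s).real

gamma = 3.0
pfa_empirical = np.mean(T_mf > gamma)
pfa_theory = np.exp(-gamma)
```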

The performance of the AMF detector (8.5) was analyzed in [12], which is summarized below.
The density of a loss factor $\rho$, which was defined in (25) of [12], is given by

$$f(\rho) = f_\beta(\rho; L+1, JN-1), \tag{A.36}$$

where $L = K - JN + 1$ and the central Beta density function is

$$f_\beta(x; n, m) = \frac{(n+m-1)!}{(n-1)!\,(m-1)!} \, x^{n-1} (1-x)^{m-1}. \tag{A.37}$$

The probability of false alarm is given by

$$P_{f,\mathrm{AMF}} = \int_0^1 \frac{f_\beta(\rho; L+1, JN-1)}{(1 + \eta\rho)^L} \, d\rho, \tag{A.38}$$

where $\eta = \gamma_{\mathrm{Kelly}} / (1 - \gamma_{\mathrm{Kelly}})$ and $\gamma_{\mathrm{Kelly}}$ is the test threshold of Kelly's GLRT (8.6). Meanwhile,
the probability of detection is given by


Pc 


d,  AMF 


=  1 


E 


x  (  Wr  a 


o  (1  +  VP)L  “  \m 


(A. 39) 


1  +  Tj  p 


f(p)dp , 


where $\xi = s^H R^{-1} s$ and $G_m(\cdot)$ is the incomplete Gamma function given by

$$G_m(y) = e^{-y} \sum_{k=0}^{m-1} \frac{y^k}{k!}. \tag{A.40}$$

The integrals can be computed by numerical integration.
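
A minimal sketch of the numerical integration for the false alarm probability is given below. It reads (A.36)-(A.38) as a Beta density with parameters $L+1$ and $JN-1$ integrated against $(1+\eta\rho)^{-L}$, following the form in [12]; the function names, the trapezoidal rule, the grid size, and the example values of $K$ and $JN$ are choices made here, not taken from the text.

```python
import math
import numpy as np

def beta_density(x, n, m):
    """Central Beta density (n+m-1)!/((n-1)!(m-1)!) x^(n-1) (1-x)^(m-1)."""
    log_c = math.lgamma(n + m) - math.lgamma(n) - math.lgamma(m)
    return np.exp(log_c) * x ** (n - 1) * (1.0 - x) ** (m - 1)

def pfa_amf(eta, K, JN, grid=20001):
    """False alarm probability by trapezoidal integration over the loss factor."""
    L = K - JN + 1
    rho = np.linspace(0.0, 1.0, grid)
    integrand = beta_density(rho, L + 1, JN - 1) / (1.0 + eta * rho) ** L
    step = rho[1] - rho[0]
    return float(np.sum((integrand[1:] + integrand[:-1]) * 0.5 * step))

# Hypothetical example: K = 32 training signals, JN = 16 space-time dimensions
pfa_example = pfa_amf(eta=0.5, K=32, JN=16)
```

With $\eta = 0$ the integrand reduces to the Beta density itself, so the routine should return 1; this gives a quick sanity check on the grid resolution.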
Appendix B

Parametric GLRT


B.1 ML Parameter Estimation


In this appendix, we develop the ML parameter estimators under both hypotheses. Recall that the
likelihood functions under both hypotheses differ only in the value of $\alpha$, that is, $\alpha = 0$ under $H_0$
and $\alpha \neq 0$ under $H_1$. We will show that the ML estimates under $H_0$ can be obtained by setting
$\alpha = 0$ in the ML estimates under $H_1$.

Let $\tilde{x}_k(n; A)$ denote the temporally whitened version of $x_k(n)$:

$$\tilde{x}_k(n; A) = x_k(n) + \sum_{p=1}^{P} A^H(p) x_k(n-p). \tag{B.1}$$
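
The temporal whitening in (B.1) amounts to a multichannel FIR filter. A minimal sketch follows; the function name `temporally_whiten`, the array layout (channels by time), and the choice to return only the samples $n = P, \ldots, N-1$ are assumptions made for illustration.

```python
import numpy as np

def temporally_whiten(x, A):
    """Apply x(n) + sum_p A^H(p) x(n - p) for n = P, ..., N-1.

    x: (J, N) array, one column per time sample.
    A: list of P coefficient matrices A(1), ..., A(P), each (J, J).
    Returns a (J, N - P) array.
    """
    P, N = len(A), x.shape[1]
    out = x[:, P:].astype(complex)
    for p in range(1, P + 1):
        # columns P-p, ..., N-1-p hold x(n - p) for n = P, ..., N-1
        out = out + A[p - 1].conj().T @ x[:, P - p:N - p]
    return out

# Single-channel check: x(n) = 0.5 x(n-1) is whitened to zero by A(1) = -0.5
x_demo = 0.5 ** np.arange(8, dtype=float)[None, :]
w_demo = temporally_whiten(x_demo, [np.array([[-0.5]])])
```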

Conditioned on the first $P$ values $\{x_k(n)\}_{n=0}^{P-1}$, $k = 0, 1, \ldots, K$, the log-likelihood function is
proportional to (within an additive constant) [37]

$$-L \ln |Q| - \sum_{k=1}^{K} \sum_{n=P}^{N-1} \tilde{x}_k^H(n; A) Q^{-1} \tilde{x}_k(n; A) - \sum_{n=P}^{N-1} \big[ \tilde{x}_0(n; A) - \alpha \tilde{s}(n; A) \big]^H Q^{-1} \big[ \tilde{x}_0(n; A) - \alpha \tilde{s}(n; A) \big]. \tag{B.2}$$

It is noted that for the large-sample case, the likelihood function can be well approximated by
the above conditional distribution [30]. We therefore use (B.2) for ML estimation. Taking the
derivative of (B.2) with respect to $Q$ and equating it to zero produces the ML estimate of $Q$
conditioned on $\alpha$ and $A$:

$$\hat{Q}(\alpha, A) = \frac{1}{L} \sum_{n=P}^{N-1} \left[ \sum_{k=1}^{K} \tilde{x}_k(n; A) \tilde{x}_k^H(n; A) + \big[ \tilde{x}_0(n; A) - \alpha \tilde{s}(n; A) \big] \big[ \tilde{x}_0(n; A) - \alpha \tilde{s}(n; A) \big]^H \right]. \tag{B.3}$$

Substituting the above $\hat{Q}(\alpha, A)$ back in (B.2), we find that maximizing (B.2) reduces to
minimizing $|\hat{Q}(\alpha, A)|$. Therefore, the ML estimates of $\alpha$ and $A$ can be obtained by minimizing
$|\hat{Q}(\alpha, A)|$ with respect to $\alpha$ and $A$. In turn, we can get the ML estimate of $Q$ by replacing $\alpha$ and
$A$ with their ML estimates in (B.3). Next, observe that

$$L \hat{Q}(\alpha, A) = R_{xx}(\alpha) + A^H R_{yx}(\alpha) + R_{yx}^H(\alpha) A + A^H R_{yy}(\alpha) A = \big( A^H + R_{yx}^H(\alpha) R_{yy}^{-1}(\alpha) \big) R_{yy}(\alpha) \big( A^H + R_{yx}^H(\alpha) R_{yy}^{-1}(\alpha) \big)^H + R_{xx}(\alpha) - R_{yx}^H(\alpha) R_{yy}^{-1}(\alpha) R_{yx}(\alpha), \tag{B.4}$$

where the $\alpha$-dependent correlation matrices are defined in (4.10)-(4.12). Since $R_{yy}(\alpha)$ is
non-negative definite and the remaining terms in (B.4) do not depend on $A$, it follows that¹

$$\hat{Q}(\alpha, A) \geq \hat{Q}(\alpha, A)\big|_{A = \hat{A}(\alpha)}, \tag{B.5}$$

where

$$\hat{A}^H(\alpha) = -R_{yx}^H(\alpha) R_{yy}^{-1}(\alpha). \tag{B.6}$$

When $\hat{Q}(\alpha, A)$ is minimized, the estimate $\hat{A}(\alpha)$ of $A$ will minimize any non-decreasing function
including the determinant of $\hat{Q}(\alpha, A)$ [41]. It should be noted that in finding the estimate of $A$,
we did not impose the constraint that the underlying AR process is stable, for the sake of obtaining
a simple solution. Hence, the unconstrained ML estimates of $A$ and $Q$ conditioned on $\alpha$ are given
by (4.17) and (4.18), respectively.

Replacing $A$ in (B.4) by $\hat{A}(\alpha)$ followed by minimizing $|\hat{Q}(\alpha, \hat{A}(\alpha))|$ yields the ML
amplitude estimator of $\alpha$ given by (4.9). Once the ML estimate $\hat{\alpha}_{\mathrm{ML}}$ of the signal amplitude $\alpha$ is
obtained, substituting $\hat{\alpha}_{\mathrm{ML}}$ in (4.17) and (4.18) yields the ML estimates of $A$ and $Q$ under $H_1$,
which are given by (4.15) and (4.16), respectively.

Since $\alpha = 0$ under $H_0$, substituting $\alpha = 0$ in (B.5) and (B.6) leads to the ML estimates of $A$
and $Q$ under $H_0$, which are given by (4.20) and (4.21), respectively.

¹For two non-negative definite matrices $A$ and $B$, we have $A \geq B$ if $A - B$ is non-negative definite [41].
B.2 Derivation of the AML Estimator


Using  definitions  in  (4.26)-(4.29),  (4.22)  can  be  written  as 


X„-aS  Xn-«S 


E 


=  (x„  -  ah)  (P  +  P±)  (io  -  a§)  +  EXiX 


=  (XnP-aS)  (XoP-aS) 


=  (XnP-aS)  (XnP-aS)  + 


Next,  observe  that  minimizing 


C2(a)  =  (XoP-aS)  (X0P-aS)  1  + 1  , 


is  asymptotically  equivalent  to  minimizing  [39,41]: 


CM  =  tr  <M  Xr,P  -  aS  V  1  XnP  -  aS 


which is a quadratic function of $\alpha$. Minimizing (B.9) with respect to $\alpha$ leads to the AML estimate
$\hat{\alpha}_{\mathrm{AML}}$ given by (4.25) (see also [39]).


B.3  Unbiasedness  and  Consistency  of  the  LS  Estimator 


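First, note that

$$E[\hat{\alpha}_{\mathrm{LS}}] = E\left[ \frac{s^H x_0}{s^H s} \right] = E\left[ \alpha + \frac{s^H d_0}{s^H s} \right] = \alpha, \tag{B.10}$$

which indicates that the LS estimator is unbiased. Moreover, the variance is given by

$$\mathrm{var}[\hat{\alpha}_{\mathrm{LS}}] = E\left[ \left( \frac{s^H d_0}{s^H s} \right) \left( \frac{s^H d_0}{s^H s} \right)^H \right] = \frac{s^H R s}{(s^H s)^2}. \tag{B.11}$$

Next, we show that the variance vanishes as the number of observations $N$ increases. Note that
the numerator can be written as

$$s^H R s = s^H R^{1/2} R^{1/2} s, \tag{B.12}$$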

where $R^{1/2}$ denotes the Hermitian square root of $R$. It is known that multiplying by $R^{1/2}$ is a
coloring linear transform. Under Assumption AS4, such a coloring transform using the square
root of the joint space-time covariance matrix $R$ is asymptotically equivalent to a cascade of an
AR filter, which performs temporal coloring, followed by a spatial coloring filter [30]. As such,
(B.12) can be approximated (for large $N$) as

$$s^H R s \approx \sum_{n=0}^{N-1} \breve{s}^H(n) Q \breve{s}(n), \tag{B.13}$$

where $\breve{s}(n) \in \mathbb{C}^{J \times 1}$ denotes the output of the multichannel AR filter as specified in AS4, given
the input signal $s(n)$.

Let the eigenvalue decomposition of $Q$ be expressed as $Q = U \Lambda U^H$, where $\Lambda$ is a diagonal
matrix containing all eigenvalues and $U$ is composed of the corresponding eigenvectors. Let $\lambda_{\max}$
denote the largest eigenvalue of $Q$. We have

$$\sum_{n=0}^{N-1} \breve{s}^H(n) Q \breve{s}(n) \leq \lambda_{\max} \sum_{n=0}^{N-1} \breve{s}^H(n) U U^H \breve{s}(n) = \lambda_{\max} \sum_{n=0}^{N-1} \|\breve{s}(n)\|^2. \tag{B.14}$$

It follows that

$$\mathrm{var}[\hat{\alpha}_{\mathrm{LS}}] \leq \frac{\lambda_{\max} \sum_{n=0}^{N-1} \|\breve{s}(n)\|^2}{\left( \sum_{n=0}^{N-1} \|s(n)\|^2 \right)^2}, \tag{B.15}$$

since $(s^H s)^2 = \left( \sum_{n=0}^{N-1} \|s(n)\|^2 \right)^2$.

Assuming that the AR filter is stable, we have (e.g., [81]):

$$\sum_{n=0}^{N-1} \|\breve{s}(n)\|^2 \leq C \sum_{n=0}^{N-1} \|s(n)\|^2 \tag{B.16}$$

for some bounded constant $C$. Hence, for a given AR filter and spatial covariance $Q$, the
right-hand side of (B.15) vanishes as $N$ goes to infinity. This proves that the LS amplitude estimate is
statistically consistent.

B.4  Derivation  of  CRB 


Let $\theta = [\theta_r^T, \theta_s^T]^T$, where $\theta_r = [\Re\{\alpha\}, \Im\{\alpha\}]^T$, and $\theta_s$ contains all nuisance parameters in
$\{A^H(p)\}_{p=1}^{P}$ and $Q$. It is shown in [37] that the Fisher information matrix (FIM) for $\theta$ is block
diagonal with respect to $\theta_r$ and $\theta_s$. Therefore, the CRB for the signal amplitude estimate is determined
by the FIM associated with $\theta_r$, which is given by [37]

$$J_{\theta_r, \theta_r}(\theta) = \left[ 2 \sum_{n=P}^{N-1} \tilde{s}^H(n; A) Q^{-1} \tilde{s}(n; A) \right] I_2. \tag{B.17}$$

By inverting (B.17) and using $\mathrm{CRB}(\alpha) = \mathrm{CRB}(\Re\{\alpha\}) + \mathrm{CRB}(\Im\{\alpha\})$, we have the CRB given
by (4.32).
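
Written out (a sketch that should coincide with (4.32), which is not reproduced here, and that assumes $\tilde{s}(n; A)$ denotes the temporally whitened steering vector), each of the two diagonal entries of the inverse of (B.17) equals $1 / \big(2 \sum_{n=P}^{N-1} \tilde{s}^H(n; A) Q^{-1} \tilde{s}(n; A)\big)$, so that

$$\mathrm{CRB}(\alpha) = \frac{1}{2 \sum_{n=P}^{N-1} \tilde{s}^H(n; A) Q^{-1} \tilde{s}(n; A)} + \frac{1}{2 \sum_{n=P}^{N-1} \tilde{s}^H(n; A) Q^{-1} \tilde{s}(n; A)} = \frac{1}{\sum_{n=P}^{N-1} \tilde{s}^H(n; A) Q^{-1} \tilde{s}(n; A)}.$$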
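B.5 Asymptotic Distribution of the Parametric GLRT Statistic


Using the asymptotic results for the GLRT [7], the asymptotic distribution of our parametric
GLRT statistic is given by

$$T_{\mathrm{GLRT}} \sim \begin{cases} \chi_2^2, & \text{under } H_0, \\ \chi_2'^2(\lambda), & \text{under } H_1, \end{cases} \tag{B.18}$$

where the non-centrality parameter $\lambda$ is given by

$$\lambda = (\theta_{r_1} - \theta_{r_0})^T \Big( \big[ J^{-1}([\theta_{r_0}, \theta_s]) \big]_{\theta_r, \theta_r} \Big)^{-1} (\theta_{r_1} - \theta_{r_0}), \tag{B.19}$$

where $\theta_{r_0}$ and $\theta_{r_1}$ are $\theta_r$ under $H_0$ and $H_1$, respectively; $\big[ J^{-1}([\theta_{r_0}, \theta_s]) \big]_{\theta_r, \theta_r}$ is the 2 × 2
upper-left partition of $J^{-1}([\theta_{r_0}, \theta_s])$. Using the observations $\theta_{r_1} - \theta_{r_0} = [\alpha_R, \alpha_I]^T$ and (cf. (B.17))

$$\big[ J^{-1}([\theta_{r_0}, \theta_s]) \big]_{\theta_r, \theta_r} = \left[ 2 \sum_{n=P}^{N-1} \tilde{s}^H(n; A) Q^{-1} \tilde{s}(n; A) \right]^{-1} I_2, \tag{B.20}$$

we have the asymptotic distribution of the parametric GLRT statistic as shown in (4.36).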
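Appendix C

A Simplified Parametric GLRT

C.1 Derivation of (5.23)


Starting from (5.14), the determinant of $R_x(\alpha)$ can be written as

$$|R_x(\alpha)| = \left| (X_0 - \alpha S)(X_0 - \alpha S)^H + \sum_{k=1}^{K} X_k X_k^H \right| = \left| (X_0 P_S - \alpha S)(X_0 P_S - \alpha S)^H + \underbrace{X_0 P_S^\perp X_0^H + \sum_{k=1}^{K} X_k X_k^H}_{\triangleq \tilde{R}_x} \right| = \left| (X_0 P_S - \alpha S)(X_0 P_S - \alpha S)^H \tilde{R}_x^{-1} + I \right| \cdot |\tilde{R}_x|. \tag{C.1}$$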


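Considering the idempotent matrices $P_S$ and $P_S^\perp$ and assuming that the number of sample data is large
enough, i.e., $N \gg 1$, we have

$$\mathrm{rank}(P_S) \leq J(P+1), \quad \text{and} \quad \mathrm{rank}(P_S^\perp) \geq N - P - J(P+1), \tag{C.2}$$

where $\mathrm{rank}(\cdot)$ denotes the rank of a matrix. Then, we have [39]

$$(X_0 P_S - \alpha S)(X_0 P_S - \alpha S)^H \tilde{R}_x^{-1} = O\!\left( \frac{1}{N-P} \right) \ll 1. \tag{C.3}$$

Let $\{\lambda_m\}_{m=1}^{M}$ denote the eigenvalues of the matrix in (C.3), which satisfy $0 \leq \lambda_m \ll 1$ according
to (C.3). Then

$$\left| (X_0 P_S - \alpha S)(X_0 P_S - \alpha S)^H \tilde{R}_x^{-1} + I \right| = \prod_{m=1}^{M} (1 + \lambda_m) \overset{(a)}{\approx} 1 + \sum_{m=1}^{M} \lambda_m = 1 + \mathrm{tr}\left[ (X_0 P_S - \alpha S)^H \tilde{R}_x^{-1} (X_0 P_S - \alpha S) \right], \tag{C.4}$$

where the approximation (a) holds in a first-order sense. Similarly, the determinant of $R_y(\alpha)$
can be expressed as

$$|R_y(\alpha)| = \left| (Y_0 P_T - \alpha T)(Y_0 P_T - \alpha T)^H \tilde{R}_y^{-1} + I \right| \cdot |\tilde{R}_y|, \tag{C.5}$$

and

$$\left| (Y_0 P_T - \alpha T)(Y_0 P_T - \alpha T)^H \tilde{R}_y^{-1} + I \right| \approx 1 + \mathrm{tr}\left\{ (Y_0 P_T - \alpha T)^H \tilde{R}_y^{-1} (Y_0 P_T - \alpha T) \right\}. \tag{C.6}$$

Then, combining (C.1) and (C.4)-(C.6) and ignoring the terms independent of $\alpha$ results in the
asymptotically equivalent expression in (5.23).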
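
The first-order step, replacing a product of factors $(1 + \lambda_m)$ by one plus the trace, can be illustrated numerically. The sketch below uses an arbitrary small non-negative definite matrix `E` as a stand-in for the matrix in (C.3); its size and scale are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
M = 4                                     # hypothetical matrix size

G = rng.standard_normal((M, 6)) + 1j * rng.standard_normal((M, 6))
E = 1e-3 * (G @ G.conj().T)               # small non-negative definite matrix

exact = np.linalg.det(np.eye(M) + E).real     # product of (1 + eigenvalues)
first_order = 1.0 + np.trace(E).real          # 1 + sum of eigenvalues
gap = exact - first_order                     # second-order and higher terms
```

Because all eigenvalues are non-negative, the exact determinant is never below the first-order value, and the gap shrinks quadratically as the eigenvalues become smaller.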

C.2  Derivation  of  the  Amplitude  Estimator 

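Following Section C.1 and noting that

$$
\operatorname{tr}\left\{ (\mathbf{X}_0\mathbf{P}_S - \alpha\mathbf{S})^H \tilde{\mathbf{R}}_x^{-1} (\mathbf{X}_0\mathbf{P}_S - \alpha\mathbf{S}) \right\} \ll 1,
\tag{C.7}
$$

and

$$
\operatorname{tr}\left\{ (\mathbf{Y}_0\mathbf{P}_T - \alpha\mathbf{T})^H \tilde{\mathbf{R}}_y^{-1} (\mathbf{Y}_0\mathbf{P}_T - \alpha\mathbf{T}) \right\} \ll 1,
\tag{C.8}
$$

we can approximate (5.23) as

$$
\begin{aligned}
F_1(\alpha)
&= \operatorname{tr}\left\{ (\mathbf{X}_0\mathbf{P}_S - \alpha\mathbf{S})^H \tilde{\mathbf{R}}_x^{-1} (\mathbf{X}_0\mathbf{P}_S - \alpha\mathbf{S}) \right\} \\
&\quad - \operatorname{tr}\left\{ (\mathbf{Y}_0\mathbf{P}_T - \alpha\mathbf{T})^H \tilde{\mathbf{R}}_y^{-1} (\mathbf{Y}_0\mathbf{P}_T - \alpha\mathbf{T}) \right\},
\end{aligned}
\tag{C.9}
$$

where the approximation $\ln(1 + x) \approx x$, for $x \ll 1$, was invoked. The cost function $F_1(\alpha)$ is a quadratic function of $\alpha$. It is easy to show that minimizing (C.9) with respect to $\alpha$ leads to the AML2 amplitude estimate $\hat{\alpha}$ given by (5.27).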
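The minimization can be made explicit by completing the square. The expansion below is a sketch in shorthand that is not part of the original text: the scalars $a$, $b$, and $c$ are introduced here for illustration only, $\tilde{\mathbf{R}}_x$ and $\tilde{\mathbf{R}}_y$ denote the $\alpha$-independent matrices whose inverses appear in (C.9), and the resulting minimizer is expected to agree with (5.27), which is not reproduced in this appendix. It relies on $\mathbf{S}\mathbf{P}_S = \mathbf{S}$ and $\mathbf{T}\mathbf{P}_T = \mathbf{T}$.

```latex
\begin{aligned}
b &= \operatorname{tr}\left(\mathbf{S}^H \tilde{\mathbf{R}}_x^{-1} \mathbf{X}_0\right)
   - \operatorname{tr}\left(\mathbf{T}^H \tilde{\mathbf{R}}_y^{-1} \mathbf{Y}_0\right), \\
a &= \operatorname{tr}\left(\mathbf{S}^H \tilde{\mathbf{R}}_x^{-1} \mathbf{S}\right)
   - \operatorname{tr}\left(\mathbf{T}^H \tilde{\mathbf{R}}_y^{-1} \mathbf{T}\right), \\
c &= F_1(0)
   = \operatorname{tr}\left\{\mathbf{P}_S^H \mathbf{X}_0^H \tilde{\mathbf{R}}_x^{-1} \mathbf{X}_0 \mathbf{P}_S\right\}
   - \operatorname{tr}\left\{\mathbf{P}_T^H \mathbf{Y}_0^H \tilde{\mathbf{R}}_y^{-1} \mathbf{Y}_0 \mathbf{P}_T\right\}, \\
F_1(\alpha) &= c - \alpha^{*} b - \alpha b^{*} + a\,|\alpha|^2
   = a \left| \alpha - \frac{b}{a} \right|^2 + c - \frac{|b|^2}{a}, \\
\hat{\alpha} &= \arg\min_{\alpha} F_1(\alpha) = \frac{b}{a} \qquad (a > 0).
\end{aligned}
```

The completed square also shows that the reduction in cost relative to $\alpha = 0$ is $F_1(0) - F_1(\hat{\alpha}) = |b|^2 / a$.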


C.3  Derivation  of  The  New  Parametric  GLRT 

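Using the Schur complements, we can write (7.6) as

$$
\begin{aligned}
\ln \frac{Q_{\mathrm{ML},0}}{Q_{\mathrm{ML},1}}
&= \ln \frac{\left| \mathbf{X}_0\mathbf{X}_0^H + \sum_{k=1}^{K} \mathbf{X}_k\mathbf{X}_k^H \right|}
            {\left| \mathbf{Y}_0\mathbf{Y}_0^H + \sum_{k=1}^{K} \mathbf{Y}_k\mathbf{Y}_k^H \right|}
 - \ln \frac{\left| (\mathbf{X}_0 - \hat{\alpha}_{\mathrm{ML}}\mathbf{S})(\mathbf{X}_0 - \hat{\alpha}_{\mathrm{ML}}\mathbf{S})^H + \sum_{k=1}^{K} \mathbf{X}_k\mathbf{X}_k^H \right|}
            {\left| (\mathbf{Y}_0 - \hat{\alpha}_{\mathrm{ML}}\mathbf{T})(\mathbf{Y}_0 - \hat{\alpha}_{\mathrm{ML}}\mathbf{T})^H + \sum_{k=1}^{K} \mathbf{Y}_k\mathbf{Y}_k^H \right|} \\
&\propto \ln \frac{\left| \mathbf{X}_0\mathbf{P}_S\mathbf{X}_0^H \tilde{\mathbf{R}}_x^{-1} + \mathbf{I} \right|}
                  {\left| \mathbf{Y}_0\mathbf{P}_T\mathbf{Y}_0^H \tilde{\mathbf{R}}_y^{-1} + \mathbf{I} \right|}
 - \ln \frac{\left| (\mathbf{X}_0\mathbf{P}_S - \hat{\alpha}_{\mathrm{ML}}\mathbf{S})(\mathbf{X}_0\mathbf{P}_S - \hat{\alpha}_{\mathrm{ML}}\mathbf{S})^H \tilde{\mathbf{R}}_x^{-1} + \mathbf{I} \right|}
            {\left| (\mathbf{Y}_0\mathbf{P}_T - \hat{\alpha}_{\mathrm{ML}}\mathbf{T})(\mathbf{Y}_0\mathbf{P}_T - \hat{\alpha}_{\mathrm{ML}}\mathbf{T})^H \tilde{\mathbf{R}}_y^{-1} + \mathbf{I} \right|}.
\end{aligned}
$$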

The  RHS  of  the  above  equation  can  be  further  simplified  using  asymptotic  approximations  [see 
(C.4)  and  (C.5)] 

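$$
\begin{aligned}
\ln \frac{Q_{\mathrm{ML},0}}{Q_{\mathrm{ML},1}}
&\propto \operatorname{tr}\left\{ \mathbf{P}_S^H \mathbf{X}_0^H \tilde{\mathbf{R}}_x^{-1} \mathbf{X}_0 \mathbf{P}_S \right\}
 - \operatorname{tr}\left\{ \mathbf{P}_T^H \mathbf{Y}_0^H \tilde{\mathbf{R}}_y^{-1} \mathbf{Y}_0 \mathbf{P}_T \right\} \\
&\quad - \operatorname{tr}\left\{ (\mathbf{X}_0\mathbf{P}_S - \hat{\alpha}_{\mathrm{ML}}\mathbf{S})^H \tilde{\mathbf{R}}_x^{-1} (\mathbf{X}_0\mathbf{P}_S - \hat{\alpha}_{\mathrm{ML}}\mathbf{S}) \right\} \\
&\quad + \operatorname{tr}\left\{ (\mathbf{Y}_0\mathbf{P}_T - \hat{\alpha}_{\mathrm{ML}}\mathbf{T})^H \tilde{\mathbf{R}}_y^{-1} (\mathbf{Y}_0\mathbf{P}_T - \hat{\alpha}_{\mathrm{ML}}\mathbf{T}) \right\}.
\end{aligned}
$$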
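Replacing the exact ML estimate with the AML2 amplitude estimate results in the approximate parametric GLRT

$$
T_{\mathrm{GLR}} =
\frac{\left| \operatorname{tr}\left( \mathbf{S}^H \tilde{\mathbf{R}}_x^{-1} \mathbf{X}_0 \right) - \operatorname{tr}\left( \mathbf{T}^H \tilde{\mathbf{R}}_y^{-1} \mathbf{Y}_0 \right) \right|^2}
     {\operatorname{tr}\left( \mathbf{S}^H \tilde{\mathbf{R}}_x^{-1} \mathbf{S} \right) - \operatorname{tr}\left( \mathbf{T}^H \tilde{\mathbf{R}}_y^{-1} \mathbf{T} \right)},
\tag{C.10}
$$

which is the matrix form of (5.28).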
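The link between the trace difference above and the ratio form of the statistic can be checked numerically. The sketch below is illustrative only: the dimensions, the random data, and all names (`F1`, `r_tilde`, `proj`, `a`, `b`) are hypothetical, the regressor blocks are assumed to be the first $JP$ rows of the joint blocks, and the $\alpha$-independent matrices are assumed to be built from the orthogonal complements of the row-space projectors as in Section C.1.

```python
import numpy as np

rng = np.random.default_rng(1)

def crandn(*shape):
    """Complex standard-normal samples (illustrative data only)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def proj(A):
    """Orthogonal projector onto the row space of A."""
    return np.linalg.pinv(A) @ A

def quad(alpha, Z0, A, R_inv):
    """tr{(Z0 P_A - alpha A)^H R^{-1} (Z0 P_A - alpha A)} as a real number."""
    G = Z0 @ proj(A) - alpha * A
    return np.trace(G.conj().T @ R_inv @ G).real

def r_tilde(Z, A):
    """Alpha-independent matrix: Z0 P_A^perp Z0^H plus the training outer products."""
    I = np.eye(A.shape[1])
    return Z[0] @ (I - proj(A)) @ Z[0].conj().T + sum(Zk @ Zk.conj().T for Zk in Z[1:])

J, P_ord, L, K = 2, 2, 50, 3            # hypothetical sizes; L plays the role of N - P
JP = J * P_ord
S = crandn(JP + J, L)
T = S[:JP]                              # assumed partition: T is the top JP rows of S
X = [crandn(JP + J, L) for _ in range(K + 1)]
X[0] = X[0] + (0.3 - 0.2j) * S          # test block carries a scaled copy of S
Y = [Xk[:JP] for Xk in X]

Rx_inv = np.linalg.inv(r_tilde(X, S))
Ry_inv = np.linalg.inv(r_tilde(Y, T))

def F1(alpha):
    """Quadratic cost: X-part minus Y-part of the trace terms."""
    return quad(alpha, X[0], S, Rx_inv) - quad(alpha, Y[0], T, Ry_inv)

b = np.trace(S.conj().T @ Rx_inv @ X[0]) - np.trace(T.conj().T @ Ry_inv @ Y[0])
a = (np.trace(S.conj().T @ Rx_inv @ S) - np.trace(T.conj().T @ Ry_inv @ T)).real
alpha_hat = b / a                       # closed-form minimizer of the quadratic cost
T_glr = abs(b) ** 2 / a                 # ratio form of the statistic

print(alpha_hat, T_glr, F1(0) - F1(alpha_hat))
```

The last two printed values should coincide up to rounding, which reflects the identity $F_1(0) - F_1(\hat{\alpha}) = |b|^2 / a$ for a quadratic cost with positive curvature $a$.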


C.4  Alternative  Form  of  The  Parametric  GLRT 

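Let

$$
\mathbf{S}_1 = [\mathbf{s}(P), \mathbf{s}(P+1), \cdots, \mathbf{s}(N-1)] \in \mathbb{C}^{J \times (N-P)}
\tag{C.11}
$$

and let $\mathbf{X}_{k,1} \in \mathbb{C}^{J \times (N-P)}$ be similarly defined. The matrix $\mathbf{S}$ can be rewritten as $\mathbf{S}^H = [\mathbf{T}^H, \mathbf{S}_1^H]$, where $\mathbf{T} \in \mathbb{C}^{JP \times (N-P)}$ is given by (5.21). By invoking the formula for the block matrix pseudo-inverse [82], we have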


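$$
\mathbf{P}_S^\perp = \mathbf{I} - [\mathbf{T}^H, \mathbf{S}_1^H]
\begin{bmatrix}
(\mathbf{T}^H)^\dagger - (\mathbf{T}^H)^\dagger \mathbf{S}_1^H (\mathbf{C}^\dagger + \mathbf{D}) \\
\mathbf{C}^\dagger + \mathbf{D}
\end{bmatrix},
\tag{C.12}
$$

where

$$
\mathbf{C} = \left( \mathbf{I}_{N-P} - \mathbf{T}^H (\mathbf{T}^H)^\dagger \right) \mathbf{S}_1^H \in \mathbb{C}^{(N-P) \times J},
\tag{C.13}
$$

and

$$
\begin{aligned}
\mathbf{D} &= (\mathbf{I}_J - \mathbf{C}^\dagger\mathbf{C})
\left[ \mathbf{I}_J + (\mathbf{I}_J - \mathbf{C}^\dagger\mathbf{C})\, \mathbf{S}_1 \mathbf{T}^\dagger (\mathbf{T}^\dagger)^H \mathbf{S}_1^H\, (\mathbf{I}_J - \mathbf{C}^\dagger\mathbf{C}) \right]^{-1} \\
&\quad \times \mathbf{S}_1 \mathbf{T}^\dagger (\mathbf{T}^\dagger)^H \left( \mathbf{I}_{N-P} - \mathbf{S}_1^H \mathbf{C}^\dagger \right) \in \mathbb{C}^{J \times (N-P)}.
\end{aligned}
\tag{C.14}
$$

Expanding (C.12) yields

$$
\mathbf{P}_S^\perp = \mathbf{P}_T^\perp \left( \mathbf{I} - \mathbf{S}_1^H (\mathbf{C}^\dagger + \mathbf{D}) \right) = \mathbf{P}_T^\perp (\mathbf{I} - \mathbf{E}).
\tag{C.15}
$$

From (5.24) and (5.25), $\tilde{\mathbf{R}}_x$ can be rewritten as

$$
\tilde{\mathbf{R}}_x =
\begin{bmatrix}
\mathbf{R}_{x,1} & \mathbf{R}_{x,2} \\
\mathbf{R}_{x,2}^H & \mathbf{R}_{x,3}
\end{bmatrix},
\tag{C.16}
$$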
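where

$$
\mathbf{R}_{x,1} = \tilde{\mathbf{R}}_y - \mathbf{Y}_0 \mathbf{P}_T^\perp \mathbf{E} \mathbf{Y}_0^H,
\tag{C.17}
$$

$$
\mathbf{R}_{x,2} = \mathbf{Y}_0 \mathbf{P}_T^\perp (\mathbf{I} - \mathbf{E}) \mathbf{X}_{0,1}^H + \sum_{k=1}^{K} \mathbf{Y}_k \mathbf{X}_{k,1}^H,
\tag{C.18}
$$

$$
\mathbf{R}_{x,3} = \mathbf{X}_{0,1} \mathbf{P}_T^\perp (\mathbf{I} - \mathbf{E}) \mathbf{X}_{0,1}^H + \sum_{k=1}^{K} \mathbf{X}_{k,1} \mathbf{X}_{k,1}^H.
\tag{C.19}
$$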
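The projector identity that underlies this partition can be verified numerically from the block pseudo-inverse formula. The following is a minimal sketch with hypothetical sizes and random data; the names `A`, `B`, `N_C`, and `Gm` are introduced here for illustration, with `A` and `B` standing for the Hermitian transposes of the two row blocks of the steering matrix. With generic random data the matrix `C` has full column rank, so `D` is numerically zero and only the simplest case of the formula is exercised.

```python
import numpy as np

rng = np.random.default_rng(2)

def crandn(*shape):
    """Complex standard-normal samples (illustrative data only)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

J, P_ord, L = 2, 3, 40                 # hypothetical sizes; L plays the role of N - P
T = crandn(J * P_ord, L)               # regressor block, JP x L
S1 = crandn(J, L)                      # remaining J rows
S = np.vstack([T, S1])                 # S^H = [T^H, S1^H]

A, B = T.conj().T, S1.conj().T         # column blocks of S^H
A_pinv = np.linalg.pinv(A)             # (T^H)^+
I_L, I_J = np.eye(L), np.eye(J)

P_T_perp = I_L - A @ A_pinv            # complement of the projector onto the row space of T
C = P_T_perp @ B                       # (N-P) x J matrix C
C_pinv = np.linalg.pinv(C)
N_C = I_J - C_pinv @ C                 # vanishes when C has full column rank
Gm = B.conj().T @ A_pinv.conj().T @ A_pinv        # S1 T^+ (T^+)^H, size J x L
D = N_C @ np.linalg.inv(I_J + N_C @ Gm @ B @ N_C) @ Gm @ (I_L - B @ C_pinv)
E = B @ (C_pinv + D)                   # E = S1^H (C^+ + D)

# Direct computation of the complement projector for the full matrix S
P_S_perp = I_L - S.conj().T @ np.linalg.pinv(S.conj().T)

print(np.max(np.abs(P_S_perp - P_T_perp @ (I_L - E))))
```

The printed maximum deviation should be at the level of floating-point rounding, consistent with the factorization of the complement projector into the regressor-only projector times $(\mathbf{I} - \mathbf{E})$.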


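Applying the block matrix inversion lemma twice, first on $\tilde{\mathbf{R}}_x$ and then on $\mathbf{R}_{x,1}$, we have

$$
\tilde{\mathbf{R}}_x^{-1} =
\begin{bmatrix}
\tilde{\mathbf{R}}_y^{-1} + \mathbf{W}_1 & \mathbf{W}_2 \\
\mathbf{W}_2^H & \mathbf{W}_3
\end{bmatrix},
\tag{C.20}
$$

where

$$
\begin{aligned}
\mathbf{W}_1 &= \mathbf{R}_{x,1}^{-1} \mathbf{R}_{x,2} \mathbf{W}_3 \mathbf{R}_{x,2}^H \mathbf{R}_{x,1}^{-1} \\
&\quad + \tilde{\mathbf{R}}_y^{-1} \mathbf{Y}_0 \left( \mathbf{I} - \mathbf{P}_T^\perp \mathbf{E} \mathbf{Y}_0^H \tilde{\mathbf{R}}_y^{-1} \mathbf{Y}_0 \right)^{-1} \mathbf{P}_T^\perp \mathbf{E} \mathbf{Y}_0^H \tilde{\mathbf{R}}_y^{-1},
\end{aligned}
\tag{C.21}
$$

$$
\mathbf{W}_2 = -\mathbf{R}_{x,1}^{-1} \mathbf{R}_{x,2} \mathbf{W}_3,
\tag{C.22}
$$

$$
\mathbf{W}_3 = \left( \mathbf{R}_{x,3} - \mathbf{R}_{x,2}^H \mathbf{R}_{x,1}^{-1} \mathbf{R}_{x,2} \right)^{-1}.
\tag{C.23}
$$

The signs in the second term of (C.21) follow from applying the matrix inversion lemma to $\mathbf{R}_{x,1} = \tilde{\mathbf{R}}_y - \mathbf{Y}_0 \mathbf{P}_T^\perp \mathbf{E} \mathbf{Y}_0^H$ in (C.17).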


Inserting the above results in (5.28) followed by simple manipulations, we can see that the parametric GLRT test statistic (5.28) is equivalent to (5.29).
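The block inverse can be sanity-checked numerically before it is substituted into the test statistic. The sketch below is an illustration under stated assumptions, not the author's code: sizes and data are hypothetical, the regressor blocks are taken as the first $JP$ rows of the joint blocks, and the product of the regressor complement projector with $\mathbf{E}$ is formed as the difference of the two complement projectors, which is what the projector factorization in this section implies. The second term of `W1` uses the signs obtained from the matrix inversion lemma applied to the upper-left block.

```python
import numpy as np

rng = np.random.default_rng(3)

def crandn(*shape):
    """Complex standard-normal samples (illustrative data only)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def perp(A):
    """Identity minus the orthogonal projector onto the row space of A."""
    return np.eye(A.shape[1]) - np.linalg.pinv(A) @ A

J, P_ord, L, K = 2, 2, 30, 3           # hypothetical sizes; L plays the role of N - P
JP = J * P_ord
S = crandn(JP + J, L)
T = S[:JP]
X = [crandn(JP + J, L) for _ in range(K + 1)]   # X[0] is the test block
Y = [Xk[:JP] for Xk in X]
Y0 = Y[0]

R_x = X[0] @ perp(S) @ X[0].conj().T + sum(Xk @ Xk.conj().T for Xk in X[1:])
R_y = Y0 @ perp(T) @ Y0.conj().T + sum(Yk @ Yk.conj().T for Yk in Y[1:])
PE = perp(T) - perp(S)                 # equals P_T^perp E under the projector factorization

# Partition of R_x into its three distinct blocks
R1, R2, R3 = R_x[:JP, :JP], R_x[:JP, JP:], R_x[JP:, JP:]
R1_inv, Ry_inv = np.linalg.inv(R1), np.linalg.inv(R_y)

W3 = np.linalg.inv(R3 - R2.conj().T @ R1_inv @ R2)          # inverse Schur complement
W2 = -R1_inv @ R2 @ W3
core = np.linalg.inv(np.eye(L) - PE @ Y0.conj().T @ Ry_inv @ Y0)
W1 = (R1_inv @ R2 @ W3 @ R2.conj().T @ R1_inv
      + Ry_inv @ Y0 @ core @ PE @ Y0.conj().T @ Ry_inv)

blocks = np.block([[Ry_inv + W1, W2], [W2.conj().T, W3]])
print(np.max(np.abs(blocks - np.linalg.inv(R_x))))
```

The printed deviation should be at rounding level, indicating that the assembled blocks reproduce the direct inverse of the partitioned matrix.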